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MODEL  VARIATIONAL  INVERSE  PROBLEMS  GOVERNED  BY 
PARTIAL  DIFFERENTIAL  EQUATIONS* 

NOEMI  PETRAt  AND  GEORG  STADLERt 


Abstract.  We  discuss  solution  methods  for  inverse  problems,  in  which  the  unknown  parameters 
are  connected  to  the  measurements  through  a  partial  differential  equation  (PDE).  Various  features 
that  commonly  arise  in  these  problems,  such  as  inversions  for  a  coefficient  field,  for  the  initial  con¬ 
dition  in  a  time-dependent  problem,  and  for  source  terms  are  being  studied  in  the  context  of  three 
model  problems.  These  problems  cover  distributed,  boundary,  as  well  as  point  measurements,  dif¬ 
ferent  types  of  regularizations,  linear  and  nonlinear  PDEs,  and  bound  constraints  on  the  parameter 
field.  The  derivations  of  the  optimality  conditions  are  shown  and  efficient  solution  algorithms  are 
presented.  Short  implementations  of  these  algorithms  in  a  generic  finite  element  toolkit  demonstrate 
practical  strategies  for  solving  inverse  problems  with  PDEs.  The  complete  implementations  are  made 
available  to  allow  the  reader  to  experiment  with  the  model  problems  and  to  extend  them  as  needed. 

Key  words,  inverse  problems,  PDE-constrained  optimization,  adjoint  methods,  inexact  Newton 
method,  steepest  descent  method,  coefficient  estimation,  initial  condition  estimation,  generic  PDE 
toolkit 

AMS  subject  classifications.  35R30,  49M37,  65K10,  90C53 


1.  Introduction.  The  solution  of  inverse  problems,  in  which  the  parameters 
are  linked  to  the  measurements  through  the  solution  of  a  partial  differential  equation 
(PDE)  is  becoming  increasingly  feasible  due  to  the  growing  computational  resources 
and  the  maturity  of  methods  to  solve  PDEs.  Often,  a  regularization  approach  is 
used  to  overcome  the  ill-posedness  inherent  in  inverse  problems,  which  results  in 
a  continuous  optimization  problem  with  a  PDE  as  equality  constraint  and  a  cost 
functional  that  involves  a  data  misfit  and  a  regularization  term.  After  discretization, 
these  problems  result  in  a  large-scale  numerical  optimization  problem,  with  specific 
properties  that  depend  on  the  underlying  PDE,  the  type  of  regularization  and  on  the 
available  measurements. 

We  use  three  model  problems  to  illustrate  typical  properties  of  inverse  problems 
with  PDEs,  discuss  solvers  and  demonstrate  their  implementation  in  a  generic  fi¬ 
nite  element  toolkit.  Based  on  these  model  problems  we  discuss  several  commonly 
occurring  features,  as  for  instance  the  estimation  of  a  parameter  field  in  an  elliptic 
equation,  the  inversion  for  right-hand  side  forces  and  for  the  initial  condition  in  a 
time-dependent  problem.  Distributed,  boundary  or  point  measurements  are  used  to 
reconstruct  parameter  fields  that  are  defined  on  the  domain  or  its  boundary.  The 
derivation  of  the  optimality  conditions  is  demonstrated  for  the  model  problems  and 
the  steepest  descent  method  as  well  as  inexact  Newton-type  algorithms  for  their  so¬ 
lution  are  discussed. 

Numerous  toolkits  and  libraries  for  finite  element  computations  based  on  varia¬ 
tional  forms  are  available,  for  instance  COMSOL  Multiphysics  [10],  deal. II  [ 4],  dune  [5], 
the  FEniCS  project  [11,20]  and  Sundance ,  a  package  from  the  Trilinos  project  [17]. 
These  toolkits  are  usually  tailored  towards  the  solution  of  PDEs  and  systems  of  PDEs, 
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and  cannot  be  used  straightforwardly  for  the  solution  of  inverse  problems  with  PDEs. 
However,  several  of  the  above  mentioned  packages  are  sufficiently  flexible  to  be  used  for 
the  solution  of  inverse  problems  governed  by  PDEs.  Nevertheless,  some  knowledge  of 
the  structure  underlying  these  packages  is  required  since  the  optimality  systems  aris¬ 
ing  in  inverse  problems  with  PDEs  often  cannot  be  solved  using  generic  PDE  solvers, 
which  do  not  exploit  the  optimization  structure  of  the  problems.  For  illustration 
purposes,  this  report  includes  implementations  of  the  model  problems  in  COMSOL 
Multiphysics  (linked  together  with  MATLAB)1.  Since  our  implementations  use  little 
finite  element  functionality  that  is  specific  to  COMSOL  Multiphysics,  only  few  code 
pieces  have  to  be  changed  in  order  to  have  these  implementations  available  in  other 
finite  element  packages. 

Related  papers,  in  which  the  use  of  generic  discretization  toolkits  for  the  solution 
of  PDE-constrained  optimization  or  inverse  problems  is  discussed  are  [15,21,23].  Note 
that  the  papers  [21,23]  focus  on  optimal  control  problems  and,  differently  from  our 
approach,  the  authors  use  the  nonlinear  solvers  provided  by  COMSOL  Multiphysics 
to  solve  the  arising  optimality  systems.  For  inverse  problems,  which  often  involve 
significant  nonlinearities,  this  approach  is  often  not  an  option.  In  [15],  finite  difference- 
discretized  PDE-constrained  optimization  problems  are  presented  and  short  MATLAB 
implementations  for  an  elliptic,  a  parabolic,  and  a  hyperbolic  model  problem  are 
provided.  A  systematic  review  of  methods  for  optimization  problems  with  implicit 
constraints,  as  they  occur  in  inverse  or  optimization  problems  with  PDEs  can  be 
found  in  [16].  For  a  comprehensive  discussion  of  regularization  methods  for  inverse 
problems,  and  the  numerical  solution  of  inverse  problems  (which  do  not  necessarily 
involve  PDEs)  we  refer  the  reader  to  the  text  books  [9,13,24,26]. 

The  organization  of  this  paper  is  as  follows.  The  next  three  sections  present  model 
problems,  discuss  the  derivation  of  the  optimality  systems,  and  explain  different  solver 
approaches.  In  Appendix  A,  we  discuss  practical  computation  issues  and  give  snippets 
of  our  implementations.  Complete  code  listings  can  be  found  in  Appendix  B  and  can 
be  downloaded  from  the  authors’  websites. 

2.  Parameter  field  inversion  in  elliptic  problem.  We  consider  the  estima¬ 
tion  of  a  coefficient  in  an  elliptic  partial  differential  equation  as  a  first  model  problem. 
Depending  on  the  interpretation  of  the  unknowns  and  the  type  of  measurements,  this 
model  problem  arises,  for  instance,  in  inversion  for  groundwater  flow  or  heat  con¬ 
ductivity.  It  can  also  be  interpreted  as  finding  a  membrane  with  a  certain  spatially 
varying  stiffness.  Let  D  C  Mn,  n  G  {1,  2,  3}  be  an  open,  bounded  domain  and  consider 
the  following  problem: 

min  J(a)  :=  -  [  (u  —  ud)2  dx  +—  [  \\7a\2dx,  (2.1a) 

a  2  Jn  2  Jn 

where  u  is  the  solution  of 

—V  •  (aVu)  =  /  in  Q,  .  . 

1  a0  (2.1b) 

u  =  0  on  oft  2, 

and 


a  G  Uad  :=  {a  G  L°°(f2),  a  >  aQ  >  0}.  (2.1c) 


1Our  implementations  are  based  on  COMSOL  Multiphysics  v3.5a  (or  earlier).  In  the  most  recent 
versions,  v4.0  and  v4.1,  the  scripting  syntax  in  COMSOL  Multiphysics  (with  MATLAB)  has  been 
changed.  We  plan  to  adjust  the  implementations  of  our  model  problems  to  these  most  recent  versions 
of  COMSOL  Multiphysics. 
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Here,  Ud  denotes  (possibly  noisy)  data,  /  G  iL_1(fl)  a  given  force,  7  >  0  the  regular¬ 
ization  parameter,  and  ao  >  0  the  lower  bound  for  the  unknown  coefficient  function 
a.  In  the  sequel,  we  denote  the  L2-inner  product  by  (•,*),  z.e.,  for  scalar  functions 
14,  v  and  vector  functions  u,  v  defined  on  we  denote 

(u,v)  :=  /  u(x)v (x)  dx  and  (u,  v):=  /  u (x)-v(x)dx, 

Jn  Jn 

where  denotes  the  inner  product  between  vectors.  With  this  notation,  the  varia¬ 
tional  (or  weak)  form  of  the  state  equation  (2.1b)  is:  Find  u  G  Hq(Q)  such  that 

(aVu,  Vz)  —  (/,  z)  =  0  for  all  2  G  Hq  (11),  (2.2) 

where  Hq(Q)  is  the  space  of  functions  vanishing  on  dfl  with  square  integrable  deriva¬ 
tives.  It  is  well  known  that  for  every  a,  which  is  bounded  away  from  zero,  (2.2) 
admits  a  unique  solution,  u  (this  follows  from  the  Lax-Milgram  theorem  [7]).  Based 
on  this  result  it  can  be  shown  that  the  regularized  inverse  problem  (2.1)  admits  a 
solution  [13,26].  However,  this  solution  is  not  necessary  unique. 

2.1.  Optimality  system.  We  now  compute  the  first-order  optimality  conditions 
for  (2.1),  where,  for  simplicity  of  the  presentation,  we  neglect  the  bound  constraints 
on  a,  z.e.,  Uad  •=  L°°(Q).  We  use  the  (formal)  Lagrangian  approach  (see,  e.g.,  [14,25]) 
to  compute  the  optimality  conditions  that  must  be  satisfied  at  a  solution  of  (2.1).  For 
that  purpose  we  introduce  a  Lagrange  multiplier  function  p  to  enforce  the  elliptic 
partial  differential  equation  (2.1b)  (in  the  weak  form  (2.2)).  In  general,  the  function 
p  inherits  the  type  of  boundary  condition  as  u,  but  satisfies  homogeneous  conditions. 
In  this  case,  this  means  that  p  G  Hq(Q).  The  Lagrangian  functional  ££  :  L°°(Q)  x 
Him  x  H(\  (Q)  — >  E.  which  we  use  as  a  tool  to  derive  the  optimality  system,  is  given 
by  . . 

Sf(a,u,p)  :=  -ud,u-  ud)  +  |(Va,  Va)  +  (aVu,  Vp)  -  (, f,p ).  (2.3) 

Here,  a  and  u  are  considered  as  independent  variables.  The  Lagrange  multiplier 
theory  shows  that,  at  a  solution  of  (2.1)  variations  of  the  Lagrangian  functional  with 
respect  to  all  variables  must  vanish.  These  variations  of  ££  with  respect  to  (p,  u,  a) 
in  directions  (fi,p,  a)  are  given  by 

J ?p(a,u,p)(p)  =  (aVw,  Vp)  -  (/,p)  =  0,  (2.4a) 

J*fu(a,  u,p)(u)  =  (aVp,  Vfi)  +  (u  —  Ud ,  u)  =  0,  (2.4b) 

^fa(a,?x,p)(a)  =  y(Va,  Va)  +  (aS7u,  Vp)  =0,  (2.4c) 

where  the  variations  (fi,p,  a)  are  taken  from  the  same  spaces  as  (iqp,  a).  Note  that 

(2.4a)  is  the  weak  (or  variational)  form  of  the  state  equation  (2.1b).  Moreover,  assum¬ 
ing  that  the  solutions  are  sufficiently  regular,  (2.4b)  is  the  weak  form  of  the  adjoint 
equation 

—V  •  (aVp)  =  —{u  —  Ud)  in  fi,  (2.5a) 

p  =  0  on  dVt.  (2.5b) 

In  addition,  the  strong  form  of  the  control  equation  (2.4c)  is  given  by 

—V  •  (Va)  =  —Vu  •  Vp  in  Q. 

Va  •  n  =  0  on  <9fL 
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(2.6a) 

(2.6b) 


Note  that  the  optimality  conditions  (2.4)  (in  weak  form)  or  (2.1b),  (2.5)  and  (2.6) 
(in  strong  form)  form  a  system  of  PDEs.  This  system  is  nonlinear,  even  though  the 
state  equation  is  linear  (in  u).  To  find  the  solution  of  (2.1),  these  conditions  need  to 
be  solved.  We  now  summarize  common  approaches  to  solve  such  a  system  of  PDEs. 
Naturally,  PDE  systems  of  this  form  can  only  be  solved  numerically,  be.,  they  have 
to  be  discretized  using,  for  instance,  the  finite  element  method.  In  the  sequel,  we 
use  variational  forms  to  present  our  algorithms.  These  forms  can  be  interpreted  as 
continuous  (z.e.,  in  function  spaces)  or  finite-dimensional  as  they  arise  in  finite  element 
discretized  problems.  For  illustration  purposes  we  also  use  block- matrix  notation  for 
the  discretized  problems. 

2.2.  Steepest  descent  method.  We  start  with  describing  the  steepest  descent 
method  [19,26]  for  the  solution  of  (2.1).  This  method  uses  first-order  derivative  (z.e., 
gradient)  information  only  to  iteratively  minimize  (2.1a).  While  being  simple  and 
commonly  used,  it  cannot  be  recommended  for  most  inverse  problems  with  PDEs  due 
to  its  unfavorable  convergence  properties.  However,  we  briefly  discuss  this  approach 
for  completeness  of  the  presentation. 

The  steepest  descent  method  updates  the  parameter  field  a  using  the  gradient  g  := 
Va  J(a)  of  problem  (2.1).  It  follows  from  Lagrange  theory  (e.g.,  [25])  that  this  gradient 
is  given  by  the  left  hand  side  in  (2.4c),  provided  (2.4a)  and  (2.4b)  are  satisfied.  Thus, 
the  steepest  descent  method  for  the  solution  of  (2.4)  computes  iterates  (uk,Pk,Uk) 
( k  =  1,2,...)  as  follows:  Given  a  coefficient  field  a^,  the  gradient  gk  is  computed  by 
first  solving  the  state  problem 

{akVuk,  Vp)  -  (/,p)  =  0  for  all  p,  (2.7a) 

for  Uk .  With  this  Uk  given,  the  adjoint  equation 

(afeVpfe,  Vu)  +  (uk  —  Ud,  u)  =  0  for  all  u  (2.7b) 

is  solved  for  pk .  Finally,  the  gradient  gk  is  obtained  by  solving 

7(Vat,  Vg )  +  (gVuk, \7pk)  =  (gk,g)  for  all  g.  (2.7c) 

Since  the  negative  gradient  gk  is  a  descent  direction  for  the  cost  functional  J,  it  is 
used  to  update  the  coefficient  a^,  be., 


^k-\-l  • —  ^kQk'  (2.7d) 

Here,  is  an  appropriately  chosen  step  length  such  that  the  cost  functional  is  suffi¬ 
ciently  decreased.  Sufficient  descent  can  be  guaranteed,  for  instance,  by  choosing  ak 
that  satisfies  the  Armijo  or  Wolfe  condition  [22].  This  process  is  repeated  until  the 
norm  of  the  gradient  gk  is  sufficiently  small.  A  description  of  the  implementation  of 
the  steepest  descent  method  in  COMSOL  Multiphysics,  as  well  as  a  complete  code 
listing  can  be  found  in  Appendix  A.l  and  Appendix  B.l.  While  the  steepest  descent 
method  is  simple  and  commonly  used,  Newton-type  methods  are  often  preferred  due 
to  their  faster  convergence. 

2.3.  Newton  methods.  Next,  we  discuss  variants  of  the  Newton’s  method  for 
the  solution  of  the  optimality  system  (2.4).  The  Newton  method  requires  second-order 
variational  derivatives  of  the  Lagrangian  (2.3).  Written  in  abstract  form,  it  computes 
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an  update  direction  (ak,uk,pk)  from  the  following  Newton  step  for  the  Lagrangian 
functional: 

£?"(ak,uk,pk)  [(ak,Uk,Pk),  (a,  u,p)]  =  -£?' (ak,uk,pk)(a,u,p),  (2.8) 

for  all  variations  (a,  iqp),  where  ££'  and  S£n  denote  the  first  and  second  variations  of 
the  Lagrangian  (2.3).  For  the  elliptic  parameter  inversion  problem  (2.1),  this  Newton 
step  (written  in  variatonal  form)  is  as  follows:  Find  ( uk,ak,pk )  as  the  solution  of  the 
linear  system 

(uk ,u)  +(ak\7pk,  Vu)  +(akVu,  Vpk)  =  (ud  -  uk,u)  -  (ak Vpk,  Vu) 

(aVuk,  Wpk)  +7 (Vd/e,  Va)  +(aVuk,  \7pk)  =  -7(Vafc,  Va)  -  ( aS7uk ,  Vpk) 

(ak\7uk,  Vp)  -h(akVuk,  Vp)  =  ~(akVuk,  Vp)  Hb  (f,p), 

(2.9) 

for  all  ( u,a,p ).  To  illustrate  features  of  the  Newton  method,  we  use  the  matrix 
notation  for  the  discretized  Newton  step  and  denote  the  vectors  corresponding  to  the 
discretization  of  the  functions  ak,uk,pk  by  ak,iik  and  pk.  Then,  the  discretization 
of  (2.9)  is  given  by  the  following  symmetric  linear  system 

"Wuu  W  ua  All"  ZLk  1  I"  Qu  1 

Wau  R  CT  ak  =  -  ga  ,  (2.10) 

A  c  0  J  L  Pk  J  [  9p  _ 

where  Wuu,  Wua,  Wau,  and  R  are  the  components  of  the  Hessian  matrix  of  the 
Lagrangian,  A  and  C  are  the  Jacobian  of  the  state  equation  with  respect  to  the  state 
and  the  control  variables,  respectively  and  gu ,  ga ,  and  gp  are  the  discrete  gradients 
of  the  Lagrangian  with  respect  to  n,  a  and  p,  respectively. 

Systems  of  the  form  (2.10),  which  commonly  arise  in  constrained  optimization 
problems  are  called  Karush-Kuhn- Tucker  (KKT)  systems.  These  systems  are  usually 
indefinite,  z.e.,  they  have  negative  and  positive  eigenvalues.  In  many  applications, 
the  KKT  systems  can  be  very  large.  Thus,  solving  them  with  direct  solvers  is  often 
not  an  option,  and  iterative  solvers  must  be  used;  we  refer  to  [2]  for  an  overview  of 
iterative  methods  for  KKT  systems. 

To  relate  the  Newton  step  on  the  first-order  optimality  system  to  the  underlying 
optimization  problem  (2.1),  we  use  a  block  elimination  in  (2.10).  Also,  we  assume 

that  uk  and  pk  satisfy  the  state  and  the  adjoint  equations  such  that  gu  =  gp  =  0.  To 

eliminate  the  incremental  state  and  adjoint  variables,  uk  and  pkl  from  the  first  and 
last  equations  in  (2.10)  we  use 

uk  =  — A-1C  &fc,  (2.11a) 

Pk  =  -A_r(Wuu'Ufe  +  Wuaa/e).  (2.11b) 

This  results  in  the  following  reduced  linear  system  for  the  Newton  step 

H  ak  =  ga .  (2.12a) 

with  the  reduced  Hessian  H  given  by 

H  :=  R  +  CrA-r(WuuA-1C  -  Wua)  -  WanA  lC.  (2.12b) 

This  reduced  Hessian  involves  the  inverse  of  the  state  and  adjoint  operators.  This 
makes  it  a  dense  matrix  that  is  often  too  large  to  be  computed  (and  stored).  However, 
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the  reduced  Hessian  matrix  can  be  applied  to  vectors  by  solving  linear  systems  with 
the  matrices  A  and  AT.  This  allows  to  solve  the  reduced  Hessian  system  (2.12a)  using 
iterative  methods  such  as  the  conjugate  gradient  method.  Once  the  descent  direction 
&£  is  computed,  the  next  step  is  to  apply  a  line  search  for  finding  an  appropriate 
step  size,  <a,  as  described  in  Section  2.2.  Note  that  each  backtracking  step  in  the  line 
search  involves  the  evaluation  of  the  cost  functional,  which  amounts  to  the  solution 
of  the  state  equation  with  a  trial  coefficient  field  a^+1. 

The  Newton  direction  dk  is  a  descent  direction  for  (2.1)  only  if  the  reduced  Hessian 
(or  an  approximation  H  of  the  reduced  Hessian)  is  positive  definite.  While  H  is 
positive  in  a  neighborhood  of  the  solution,  it  can  be  indefinite  or  singular  away  from 
the  solution,  and  dk  is  not  guaranteed  to  be  a  descent  direction.  There  are  several 
possibilities  to  overcome  this  problem.  A  simple  remedy  is  to  neglect  the  terms 
involving  Wua  and  Wau  in  (2.12b),  which  leads  to  the  Gauss-Newton  approximation 
of  the  Hessian,  which  is  always  positive  definite.  The  resulting  direction  dk  is  always 
a  descent  direction,  but  the  fast  local  convergence  of  Newton’s  method  can  be  lost 
when  neglecting  the  blocks  Wua  and  Wau  in  the  Hessian  matrix.  A  more  sophisticated 
method  to  ensure  the  positive  definiteness  of  an  approximate  Hessian  is  to  terminate 
the  conjugate  gradient  method  for  (2.12a)  when  a  negative  curvature  direction  is 
detected  [12].  This  approach,  which  is  not  followed  here  for  simplicity,  guarantees  a 
descent  direction  while  maintaining  the  fast  Newton  convergence  close  to  the  solution. 

2.4.  Gauss-Newton-CG  method.  To  guarantee  a  descent  direction  dk  in 
(2.12a),  the  Gauss-Newton  method  uses  the  approximate  reduced  Hessian 

H  =  R+CtA“tWuuA_1C.  (2.13) 

Compared  to  (2.12b),  using  the  inexact  reduced  Hessian  (2.13)  also  has  the  advantage 
that  the  matrix  blocks  Wua  and  Wau  do  not  need  to  be  assembled.  Note  that  Wua 
and  Wau  are  proportional  to  the  adjoint  variable.  If  the  measurements  are  attained  at 
the  solution  (z.e.,  u  =  Ud),  the  adjoint  variable  is  zero  and  thus  one  obtains  fast  local 
convergence  even  when  these  blocks  are  being  neglected.  In  general,  particularly  in 
the  presence  of  noise,  measurements  are  not  attained  exactly  at  the  solution  and  the 
fast  local  convergence  property  of  Newton’s  method  is  lost.  However,  often  adjoint 
variables  are  small  and  the  Gauss-Newton  Hessian  is  a  reasonable  approximation  for 
the  full  reduced  Hessian. 

The  Gauss-Newton  method  can  alternatively  be  interpreted  as  an  iterative  method 
that  computes  a  search  direction  based  on  an  auxiliary  problem,  which  is  given  by 
a  quadratic  approximation  of  the  cost  functional  and  a  linearization  of  the  PDE 
constraint  [22].  As  for  the  Newton  method,  also  for  the  Gauss-Newton  method  the 
optimal  step  length  in  a  neighborhood  of  the  solution  is  a  =  1.  This  property  of 
Newton-type  methods  is  a  significant  advantage  compared  to  the  steepest  descent 
method,  where  no  prior  information  on  a  good  step  length  is  available. 

2.5.  Bound  constraints  via  the  logarithmic  barrier  method.  In  the  pre¬ 
vious  sections  we  have  neglected  the  bound  constraints  on  the  coefficient  function  a 
in  the  inverse  problems  (2.1).  However,  in  practical  applications  one  often  has  to  (or 
would  like  to)  impose  bound  constraints  on  inversion  parameters.  This  is  usually  due 
to  a  priori  available  physical  knowledge  about  the  parameters  that  are  reconstructed 
in  the  inverse  problem.  In  (2.1),  for  instance,  a  has  to  be  bounded  away  from  zero 
for  physical  reasons  and  to  maintain  the  ellipticity  (and  unique  solvability)  of  the 
state  equation.  Another  example  in  which  the  result  of  the  inversion  can  benefit  from 
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imposing  bounds  is  the  problem  in  Section  3,  where  we  invert  for  a  concentration, 
which  cannot  be  negative. 

We  now  extend  Newton’s  method  to  incorporate  bounds  of  the  form  (2.1c). 
The  approach  used  here  is  a  very  simplistic  one,  namely  the  logarithmic  barrier 
method  [22].  We  add  a  logarithmic  barrier  with  barrier  parameter  /i  to  the  dis¬ 
cretization  of  the  optimization  problem  (2.1)  to  enforce  a  —  aQ  >  0,  be.,  aQ  is  the 
lower  bound  for  the  coefficient  function  a.  Then,  the  discretized  form  of  the  Newton 
step  on  the  optimality  conditions  is  given  by  the  following  linear  system 


"Wuu  Wua 

wau  R  +  Z 
A  C 


A  Ufa 

CT  ak 
0  pk 


a 

Q*k  CLo 


(2.14) 


where  the  same  notations  as  in  (2.10)  are  used.  The  terms  due  to  the  logarithmic 
barrier  impose  the  bound  constraints  at  nodal  points  and  only  appear  in  the  control 
equation.  The  matrix  Z  is  diagonal  with  components  (afc_^Q  y  •  Note  that  both,  Z 
and  the  right  hand  side  term — —  become  large  at  points  where  is  close  to  the 
bound  a0,  which  can  lead  to  ill-conditioning. 

Neglecting  the  terms  Wua  and  Wau  we  obtain  a  Gauss-Newton  method  for  the 
logarithmic  barrier  problem,  similarly  as  demonstrated  in  Section  2.4.  Once  the  New¬ 
ton  increment  a&  has  been  computed,  a  line  search  for  the  control  variable  update  is 
applied.  To  assure  that  o^+i  —  aQ  >  0  the  choice  for  the  initial  step  length  is  [22]: 


a  =  mini  1,  min 

y  oo)i<C0 


{CLk  & o)i 
(&k  ^o)i 


(2.15) 


It  can  be  challenging  to  choose  the  barrier  parameter  //  appropriately.  One  would  like 
/i  to  be  small  to  keep  the  influence  of  the  barrier  function  small  in  the  inner  of  the  fea¬ 
sible  set  Uad .  However,  this  can  lead  to  small  step  lengths  and  severe  ill-conditioning 
of  the  Newton  system,  which  has  led  to  the  development  of  methods  in  which  a  series 
of  logarithmic  barrier  problems  are  solved  for  a  decreasing  sequence  of  barrier  pa¬ 
rameters  fi.  For  more  sophisticated  approaches  to  deal  with  bound  constraints,  such 
as  the  (primal-dual)  interior  point  method,  or  (primal-dual)  active  set  methods,  we 
refer  the  reader  to  [3,6,22,27].  An  alternative  way  to  impose  bound  constraints  in 
the  optimization  problem  is  choosing  a  parametrization  of  the  parameter  field  that 
already  incorporates  the  constraint.  For  example,  if  ao  =  0  one  can  parametrize  the 
coefficient  field  as  a  =  exp (6),  and  invert  for  b.  Thus,  a  satisfies  the  non- negativity 
constraint  by  construction.  This  approach  comes  at  the  price  of  adding  additional 
nonlinearity  to  the  optimality  system. 

2.6.  Numerical  tests.  In  this  section,  we  show  results  obtained  with  the  steep¬ 
est  descent  and  the  Gauss-Newton-CG  methods  as  described  in  Sections  2.2  and  2.4. 
In  particular,  compare  the  number  of  iterations  needed  by  these  methods  to  achieve 
convergence  for  a  particular  tolerance.  In  Fig.  2.1,  we  show  the  “true”  coefficient 
(left),  which  is  used  to  synthesize  measurement  data  by  solving  the  state  equation  (a 
similar  test  example  is  used  in  [3]).  We  add  noise  to  this  synthetic  data  (see  Fig.  2.1, 
center)  to  lessen  the  “inverse  crime”  [18],  which  occurs  when  the  same  numerical 
method  is  used  for  the  synthetization  of  the  data  and  for  the  inversion.  An  additional 
strategy  to  avoid  inverse  crimes  would  be  solving  the  state  equation  to  synthesize 
measurement  data  using  a  different  discretization  or,  at  least,  a  finer  mesh.  The  re¬ 
covered  coefficient,  z.e.,  the  solution  of  the  inverse  problem  is  shown  on  the  right  in 
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Fig.  2.1.  Results  for  the  elliptic  coefficient  field  inversion  model  problem  (2.1)  with  7  =  10-9 
and  1%  noise  in  the  synthesized  data.  “True”  coefficient  field  a  (left),  noisy  data  u d  (center),  and 
recovered  coefficient  field  a  (right). 


Figure  2.1.  Note  that  while  the  “true”  coefficient  is  discontinuous,  the  reconstruction 
is  continuous.  This  is  a  result  of  the  regularization,  which  does  not  allow  for  discon¬ 
tinuous  fields  (the  regularization  term  would  be  infinite  if  discontinuities  were  present 
in  the  reconstruction).  A  more  appropriate  regularization  that  allows  discontinuous 
reconstructions  is  the  total  variation  regularization,  see  Section  5. 

Table  2.1  compares  the  number  of  iterations  needed  by  the  steepest  descent  and 
the  Gauss-Newton-CG  method.  As  can  be  seen,  for  all  four  meshes  the  number  of  iter¬ 
ations  needed  by  the  Gauss-Newton  method  is  significantly  smaller  than  the  number 
of  iterations  needed  by  the  steepest  descent  method.  We  note  that  while  comput¬ 
ing  the  steepest  descent  direction  requires  only  the  solution  of  one  state,  one  adjoint 
and  one  control  equation  at  each  iteration,  the  Gauss-Newton  method  additionally 
requires  the  solution  of  the  incremental  equations  for  the  state  and  adjoint  at  each 
CG  iteration.  Since  an  accurate  solution  of  the  Gauss-Newton  system  is  only  needed 
close  to  the  solution,  we  use  an  inexact  CG  method  to  reduce  the  number  of  CG 
iterations,  and  hence  the  number  of  state-adjoint  solves.  For  this  purpose,  we  adjust 
the  CG-tolerance  by  relating  it  to  the  norm  of  the  gradient,  z.e., 

tol  =  min  (o.5,  ySS)  IlflSlI, 

where  g®  and  g\  are  the  initial  gradient  and  the  gradient  at  iteration  fc,  respectively. 
This  choice  of  the  tolerance  leads  to  an  early  termination  of  the  CG  iteration  away 
from  the  solution  and  enforces  a  more  accurate  solution  of  the  Newton  system  as 
the  gradient  becomes  small  (z.e.,  close  to  the  solution).  While  compared  to  a  more 
accurate  computation  of  the  Newton  direction,  this  inexactness  can  result  in  a  larger 
number  of  Newton  iterations,  but  reduces  the  overall  number  of  CG  iterations  signif¬ 
icantly. 

With  the  early  termination  of  CG,  the  Gauss-Newton  method  requires  signifi¬ 
cantly  less  number  of  forward- adjoint  solves  than  does  the  steepest  descent  method, 
as  can  be  seen  in  Table  2.1.  For  example,  on  a  mesh  of  40  x  40  elements,  the  Gauss- 
Newton  method  takes  11  outer  (z.e.,  Gauss-Newton)  iterations  and  overall  27  inner 
(z.e.,  CG)  iterations,  which  amounts  to  39  forward- adjoint  solves  overall.  The  steepest 
descent  method,  on  the  other  hand,  requires  267  state-adjoint  solves.  This  perfor¬ 
mance  difference  becomes  more  evident  on  finer  meshes:  While  for  the  Gauss-Newton 
method  the  iteration  numbers  remain  almost  constant,  the  steepest  descent  method 
requires  significantly  more  iterations  as  the  mesh  is  refined  (see  Table  2.1). 
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Mesh 

Steepest  descent 
#iter 

Gauss-Newton  (CG) 
#iter 

10  x  10 

68 

10  (30) 

20  x  20 

97 

10  (22) 

40  x  40 

267 

11  (27) 

80  x  80 

>1000 

12  (31) 

Table  2.1 


Number  of  iterations  for  the  steepest  descent  and  the  Gauss-Newton  methods  for  7  =  10-9  and 
1%  noise  in  the  synthetic  data.  Both  iterations  were  terminated  when  the  L2  -norm  of  the  gradient 
dropped  below  10-8;  or  the  maximum  number  of  iterations  was  reached. 


3.  Initial  condition  inversion  in  advective-diffusive  transport.  We  con¬ 
sider  a  time-dependent  advection-diffusion  equation,  in  which  we  invert  for  an  un¬ 
known  initial  condition.  The  problem  can  be  interpreted  as  finding  the  initial  distri¬ 
bution  of  a  contaminant  from  measurements  taken  after  the  contaminant  has  been 
subjected  to  diffusive  transport  [1].  Let  ft  C  Mn  be  open  and  bounded  (we  choose 
n  —  2  in  the  sequel)  and  consider  measurements  on  a  part  Tm  C  3ft  of  the  boundary 
over  the  time  horizon  [Xi,T],  with  0  <  T\  <  T.  The  inverse  problem  is  formulated  as 
follows: 

min  J(uo)  :=  -  [  [  (u  —  ug)2  dx  dt  +  —  [  u^dx^-—  [  \\7uo\2  dx  (3.1a) 

n°  2  Jtx  JVm  2  Jn  2  Jn 

where  u  is  the  solution  of 


ut  —  ftA u  +  v  •  Vu  =  0 

in  ft  x  [0,  T], 

(3.1b) 

?i(0,  x)  -  uo 

in  ft, 

(3.1c) 

kS7u  •  n  =  0 

on  3ft  x  [0, T]. 

(3.1d) 

Here,  Ud  denotes  measurements  on  1^,  71  and  72  are  regularization  parameters  cor¬ 
responding  to  L2-  and  H1 -regularizations,  respectively,  and  k  >  0  is  the  diffusion 
coefficient.  The  boundary  3ft  is  split  into  disjoint  parts  T/,  Tr,  T  and  Tm  as  shown 
in  Figure  3.1  (left).  The  velocity  field,  v,  is  computed  by  solving  the  steady  Navier- 
Stokes  equation  with  the  side  walls  driving  the  flow  (see  the  sketch  on  the  right  in 
Fig.  3.1): 


in  ft,  (3.2a) 

in  ft,  (3.2b) 

on  3ft,  (3.2c) 

where  q  is  pressure,  Re  is  the  Reynolds  number,  and  g  =  (gi,#2)T  =  0  on  3ft  but 

=  1  on  Y\  and  ^2  =  -1  on  Tr. 

The  inverse  problem  (3.1)  has  several  different  properties  compared  to  the  elliptic 
parameter  estimation  problem  (2.1).  First,  the  state  equation  is  time-dependent; 
second,  the  inversion  is  based  on  boundary  data  only;  third,  the  inversion  is  for  an 
initial  condition  rather  than  a  coefficient  field;  forth,  both  L2 -regularization  and  H1- 
regularization  for  the  initial  condition  can  be  used;  and,  fifth,  the  adjoint  operator 
(i.e.,  the  operator  in  the  adjoint  equation)  is  different  from  the  operator  in  the  state 


Av  +  V<7  +  v  •  Vv  =  0 
Re 

V- v  =  0 

v  =  g 
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Fig.  3.1.  Left:  Sketch  of  domain  for  the  advective- diffusive  inverse  transport  problem  (3.1). 
Right:  The  velocity  field  v  computed  from  the  solution  of  the  Navier-Stokes  equation  (3.2)  with 
Re  =  100. 


equation  (3.1b)  since  the  advection  operator  is  not  self-adjoint.  To  compute  the 
optimality  system,  we  use  the  Lagrangian  function 


Jt?(u,uo,p,po)  :=  J(uq)  +  /  /  (ut  +  v  •  Vu)pdxdt 

J  o  Jn 

+  /  /  kVu  •  Vpdx  dt  +  /  (u(0)  —  uo)po  dx. 

Jo  Jq  Jq 


The  optimality  conditions  for  (3.1)  are  obtained  by  setting  variations  of  the  La¬ 
grangian  with  respect  to  all  variables  to  zero.  Variations  with  respect  to  p  and  po 
reproduce  the  state  equation  (3.1b)-(3.1d).  The  variation  with  respect  to  u  in  a 
direction  ft  is 


(it,  uo,p,po)(u)  =  /  /  (u  —  Ud)u  dx  dt  - b  /  /  (fit  +  v  •  Vu)pdx  dt 

J Ti  Jvm  Jo  Jn 


'T1 

rT 


f  f  ftVft  •  Vpdx  dt  +  f  u(0)po 

Jo  Jn  Jn 


dx. 


Partial  integration  in  time  for  the  term  utp  and  in  space  for  (v  •  Vft)p  =  Vft  •  (vp)  and 
ft  Vft  •  Vp  results  in 


(iq  ito,p,Po)(fO  =  /  /  (n  —  iq^ft  dx  dt  +  /  /  (— pt  —  V  •  (vp)  —  ftAp)ftdxd£ 

J T\  Jrrn  Jo  Jn 

+  f  u(T)p(T)  —  ft(0)p(0)  +  ft(0)po  dx  +  f  f  (vp  +  ft  Vp)  •  nw  dx  dt. 
Jn  Jo  Jan 


Since  at  a  stationary  point  the  variation  vanishes  for  arbitrary  ft,  we  obtain  po  =  p(0), 
as  well  as  the  following  strong  form  of  the  adjoint  system  (where  we  assume  that  the 
variables  are  sufficiently  regular  for  integration  by  parts) 


Pt  —  V  •  (pv)  —  kAp  =  0 

in  x  [0,  T], 

(3.3a) 

p{T)  =  0 

in  f2, 

(3.3b) 

(vp  +  KVp)  •  n  =  -(u  -  ud) 

on  rm  x  [Ti,T], 

(3.3c) 

(vp  +  /tVp)  •  n  =  0 

on  (90  x  [0,T])  \  (rm  x  [Ti,T]). 

(3.3d) 
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Note  that  (3.3)  is  a  final  value  problem,  since  p  is  given  at  t  =  T  rather  than  at 
t  =  0.  Thus,  (3.3)  has  to  be  solved  backwards  in  time,  which  amounts  to  the  solution 
of  an  advection-diffusion  equation  with  velocity  — v.  Finally,  the  variation  of  the 
Lagrangian  with  respect  to  the  initial  condition  u 0  in  direction  u0  is 

^u0{u,Uo,P,Po){uo)  =  /  "nuouo  +72V^0  •  Vuo  -poUodx. 

Jn 

This  variation  vanishes  for  all  uq,  if 


V  •  (72 Vix0)  +  71^0  -  Po  m  V  •  (72  Viz0)  +  71^0  -  p( 0)  =  0  (3.4) 

holds,  combined  with  homogeneous  Neumann  conditions  for  uo  on  all  boundaries. 
Note  that  the  optimality  system  (3.1b)-(3.1d),  (3.3)  and  (3.4)  is  affine  and  thus  a 
single  Newton  iteration  is  sufficient  for  its  solution.  The  system  can  be  solved  using 
a  conjugate  gradient  method  for  the  unknown  initial  condition.  The  discretization 
of  this  initial  value  problem  is  discussed  next.  Its  implementation  is  summarized  in 
Appendix  A. 3  and  listed  in  Appendix  B.3. 

3.1.  Discretization.  To  highlight  some  aspects  of  the  discretization,  we  intro¬ 
duce  the  linear  solution  and  measurement  operators  S  and  Q.  For  a  given  initial 
condition  uo  we  denote  the  solution  (in  space  and  time)  of  (3.1)  by  Suo.  The  mea¬ 
surement  operator  Q  is  the  trace  on  Tm  x  [Xi,  T],  such  that  the  measurement  data  for 
an  initial  condition  uq  can  be  written  as  With  this  notation,  the  adjoint  state 

variable  at  time  t  =  0  becomes  p(0)  =  S*  Q*(  —  (QSu 0  —  Ud)),  where  denotes  the 
adjoint  operator.  Using  this  equation  in  the  control  equation,  (3.4)  results  in 

S^Q^QSuo  +  71^0  +  V  •  (72V^0)  =  S*Qkud-  (3.5) 

Note  that  the  operator  on  the  left  hand  side  in  (3.5)  is  symmetric,  and  it  is  positive 
definite  if  71  >  0  or  72  >  0. 

We  use  the  finite  element  method  for  the  spatial  discretization  and  the  implicit 
Euler  scheme  for  the  discretization  in  time.  To  ensure  that  the  discretization  of 
the  matrix  corresponding  to  the  linear  operator  on  the  left  hand  side  in  (3.5)  is 
symmetric,  we  discretize  the  optimization  problem  and  compute  the  corresponding 
discrete  optimality  conditions,  which  is  often  referred  to  as  discretize-then-optimize. 
This  approach  is  sketched  next.  The  discretized  cost  function  is 

min  J(«0)  :=  h«  -  Ud)TQ(u  ~  ud)  +  ^-UqMu0  +  ^-UqKuo,  (3.6) 
u0  Z  Z  2 

where  u  =  [u°  u1 . . .  uN]T  corresponds  to  the  space-time  discretization  of  u  (N  is  the 
number  of  time  steps  and  ul  are  the  spatial  degrees  of  freedom  at  the  i-th  time  step), 
which  satisfies  the  discrete  forward  problem  S u  =  /.  Here,  Ud  are  the  discrete  space- 
time  measurement  data,  Uq  is  the  initial  condition,  and  Q  is  the  discretized  (space- 
time)  measurement  operator,  be.,  Q  is  a  block  diagonal  matrix  with  AtQ  (At  denotes 
the  time  step  size  and  Q  the  discrete  trace  operator)  as  diagonal  block  for  time  steps 
in  [Ti,T],  and  zero  blocks  else.  Moreover,  M  and  R  are  the  matrices  corresponding 
to  the  integration  scheme  used  for  the  regularization  terms,  and  /  =  [Muq  0 ...  0]T, 


ll 


where  M  is  the  mass  matrix.  The  discrete  forward  operator,  S,  is 
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where  L  :=  M  +  AtN,  with  N  being  the  discretization  of  the  advection-diffusion 
operator.  The  discrete  Lagrangian  for  (3.6)  is 


C(u,u0,p)  :=  J(m0)  +pT(Su  -  /), 

where  p  =  \p°  ...pN]T  is  the  discrete  (space-time)  Lagrange  multiplier.  Thus,  the 
discrete  adjoint  equation  is 


Cu(u,u0,p)  =  STp  +  Q(u  -  ud)=  0,  (3.8) 

and  the  discrete  control  equation  is 

CUo(u,u0,p)  =  7iMm0  +72R/U0  -  Mp°  =  0.  (3.9) 

Since  (3.8)  involves  the  block  matrix  S  ,  the  discrete  adjoint  equation  is  a  backwards- 
in-time  implicit  Euler  discretization  of  its  continuous  counterpart.  From  the  last  row 
in  (3.8)  we  obtain 


L  pN  =  —AtQ(uN  —  Uj)  (3.10) 

as  the  discretization  of  the  homogeneous  terminal  conditions  (3.3).  Discretizing  (3.8) 
simply  by  pN  =  0  rather  than  (3.10)  does  not  result  in  a  symmetric  discretization 
for  the  left  hand  side  of  (3.5).  Such  an  inconsistent  discretization  means  that  the 
conjugate  gradient  method  will  likely  not  converge,  which  shows  the  importance  of 
a  discretization  that  has  an  underlying  discrete  optimization  problem,  as  guaranteed 
in  a  discretize-then-optimize  approach.  Note  that  as  At  —>  0,  the  discrete  condition 
(3.10)  tends  to  its  continuous  counterpart.  The  system  (3.9)  (with  p°  computed  by 
solving  the  state  and  adjoint  eqations)  is  solved  using  the  conjugate  gradient  method 
for  the  unknown  initial  condition. 

3.2.  Numerical  tests.  Next,  we  present  numerical  tests  for  the  initial  condi¬ 
tion  inversion  problem  (3.1),  which  is  solved  with  the  conjugate  gradient  method.  To 
illustrate  properties  of  the  forward  problem,  Figure  3.2  shows  three  time  instances  of 
the  field  u ,  using  the  advective  velocity  v  from  Figure  3.1.  Note  that  the  diffusion 
quickly  blurs  the  initial  condition.  Since  the  discretization  uses  standard  finite  ele¬ 
ments,  a  certain  amount  of  physical  diffusion  (be.,  k  cannot  be  too  small)  is  necessary 
for  numerical  stability  of  the  advection-diffusion  equation.  For  advection-dominated 
flows,  a  stabilized  finite  element  methods  (such  as  SUPG;  see  [8])  has  to  be  used. 

The  solutions  of  the  inverse  problem  for  various  choices  of  regularization  param¬ 
eters  are  shown  in  Figure  3.3.  In  the  left  and  middle  plots,  we  show  the  recovered 
initial  condition  with  L2-type  regularization,  while  the  right  plot  shows  the  inversion 
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Fig.  3.3.  The  recovered  initial  condition  uq  for  M  —  I,  71  =10  5  and  72  =  0  (left),  for 
M=  M,  71  =  10  2  and  72  =  0  (center),  and  for  71  =  0;  72  =  10  6  (right).  Other  parameters  are 
k  =  0.001,  T\  =  1,  and  T  —  4. 


results  obtained  with  H1  -regularization.  For  these  examples,  quadratic  finite  ele¬ 
ments  in  space  with  3889  degrees  of  freedom,  and  20  implicit  time  steps  for  the  time 
discretization  were  used,  z.e.,  the  state  is  discretized  with  overall  77780  degrees  of  free¬ 
dom.  The  CG  iterations  are  terminated  when  the  relative  residual  drops  below  10-4, 
which  requires  32  iterations  for  the  regularization  with  the  identity  matrix  (left  plot 
in  Figure  3.3),  43  iterations  for  the  L2-regularization  with  the  mass  matrix  (middle 
plot)  and  157  iterations  for  the  H1  -regularization  (right  plot).  Figure  3.3  shows  that 
the  L2-type  regularization  with  the  identity  matrix  allows  spurious  oscillations  in  the 
reconstruction,  since  these  high-frequency  components  correspond  to  very  small  (or 
zero)  eigenvectors  of  the  misfit  Hessian  as  well  as  the  regularization.  The  smoothing 
effect  of  the  771 -regularization  prevents  this  behavior  and  leads  to  a  much  improved 
reconstruction.  Since  the  measurements  are  restricted  to  Tm,  they  do  not  provide 
information  on  the  initial  condition  in  the  upper  left  part  of  the  domain.  Thus,  in 
that  part  the  reconstruction  of  the  initial  condition  is  controlled  by  the  regularization 
only,  which  explains  the  significant  differences  for  the  different  regularizations.  Fi¬ 
nally,  note  that  the  reconstructed  initial  concentrations  also  contain  negative  values. 
This  unphysical  behavior  could  be  avoided  by  enforcing  the  bound  constraint  uq  >  0 
in  the  inversion  procedure. 

4.  Source  terms  inversion  in  a  nonlinear  elliptic  problem.  As  our  last 
model  problem  we  consider  the  estimation  of  the  volume  and  boundary  source  in  a 
nonlinear  elliptic  equation.  We  assume  a  situation  where  only  point  measurements 
are  available,  which  results  in  a  problem  formulated  as 

min  J{f,g)~\  [  {u  -  ud)2b(x)  dx  +  ^  [  |V/|2cte+^  [  g2  dx,  (4.1a) 
f>9  4  JQ  Z  JQ  Z  JT 
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where  u  is  the  solution  of 

—A u  +  u  +  cu3  =  f  in  f2,  (4.1b) 

Vu  •  n  =  g  on  T,  (4.1c) 

where  Q  C  Mn,  n  G  {1,2,3}  is  an  open,  bounded  domain  with  boundary  T  :=  dfl. 
The  constant  c  >  0  controls  the  amount  of  nonlinearity  in  the  state  equation  (4.1b) 
and  b(x)  denotes  the  point  measurement  operator  defined  by 

Nr 

b(x)  =  ^2  b(x  ~  xj )  f°r  3  “  1?  •••>  (4.2) 

3  =  1 

where  Nr  denotes  the  number  of  point  measurements,  and  S(x  —  Xj )  is  the  Dirac  delta 
function.  Moreover,  /  G  H 1(fl)  and  g  G  L2(T)  are  the  source  terms,  71  >  0  and 
72  >  0  the  regularization  parameters,  and  n  is  the  outwards  normal  for  T.  Note  that 
while  the  notation  in  (4.1)  suggests  that  Ud  is  a  function  given  on  all  of  fi,  due  to  the 
definition  of  b  only  the  point  data  Ud(xj)  are  needed.  We  assume  that  the  domain  Q  is 
sufficiently  smooth,  which  implies  that  the  solution  of  (4.1b)  and  (4.1c)  is  sufficiently 
regular  for  the  point  evaluation  to  be  well  defined. 

Problem  (4.1)  (we  refer  to  [9]  for  a  similar  problem)  is  used  to  demonstrate  features 
of  inverse  problems  that  are  not  present  in  the  elliptic  parameter  estimation  problem 
(Section  2)  or  the  initial-time  inversion  (Section  3).  Namely,  the  state  equation  is 
nonlinear,  we  invert  for  sources  rather  than  for  a  coefficient,  and  the  inversion  is  for 
two  fields,  for  which  different  regularizations  are  used.  Moreover,  the  inversion  is 
based  on  discrete  point  rather  than  distributed  or  boundary  measurements. 

The  computation  of  the  optimality  system  for  (4.1)  is  based  on  the  Lagrangian 
functional 

u,f,g,P )  :=  J(u,f,9 )  +  (Vw,  Vp)  +  ( u  +  cu 3  -  f,p)  -  (g,p) r, 

where  (• ,  -)r  denotes  the  L2-product  over  the  boundary  T,  and,  as  before,  (• ,  •)  is  the 
L2-product  over  Q.  We  compute  the  variations  of  the  Lagrangian  with  respect  to  all 
variables  and  set  them  to  zero  to  derive  the  (necessary)  optimality  conditions.  This 
results  in  the  weak  form  of  the  first-order  optimality  system: 


0  =  3fp(u,  f,g,p)(p)  =  (Vu,  Vp)  +  ( u  +  cu 3  - 

f,p)~  (g,p) r, 

(4.3a) 

0  =  3fu(u,f,g,p)(u)  =  (Vp,  V«)  +  ((1  +  3 cu2)p,u)  +  ((u  -  ud)b(x),u), 

(4.3b) 

0  =  Jz?f(u,f,g,p)(f )  =  7i  (V/,  V/)  - 

(p,f), 

(4.3c) 

0  =  -2,g(u,f,g,p)(g)  =  {i2g-p,g)r, 

(4.3d) 

for  all  variations  (ft,/,^,p).  Invoking  Green’s  (first)  identity  where  needed, 

and  rear- 

ranging  terms,  the  strong  form  for  this  system  is 

as  follows: 

—A  u  +  u  +  cu 3  =  / 

in  fl, 

(state  equation) 

(4.4a) 

Vu  •  n  —  g 

on  T, 

—A  p  +  (1  +  3  cu2)p  =  (ud  —  u)b(x) 

in  D, 

(adjoint  equation) 

(4.4b) 

$ 

3 

II 

o 

on  T, 

-V  •  (71V/)  =p 

in  fi, 

(/-gradient) 

(4.4c) 

<1 

S 

II 

o 

on  T, 

72  g  -  p  =  o 

in  T. 

(^-gradient) 

(4.4d) 
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We  solve  the  optimality  system  (4.3)  (or  equivalently  (4.4))  using  the  Gauss- 
Newton  method  described  in  Section  2.3.  After  discretizing  and  taking  variations 
of  (4.3a)-(4.3d)  with  respect  to  (u,f,g,p)  we  obtain  the  Gauss-Newton  step  (where 
we  assume  that  the  state  and  adjoint  equations  have  been  solved  exactly  and  thus 
9u  9p  0) 


BOO  At' 

"  Uk  ' 

'  0 

0  Ri  0  Cf 

fk 

9f 

0  0  r2  cl 

9k 

9g 

> 

O 

O 

to 

0 

.  Pk  . 

0 

Here,  B  is  the  matrix  corresponding  to  the  point  measurements,  Ri  and  R2  are 
stiffness  and  boundary  mass  matrices  corresponding  to  the  regularization  for  /  and  g , 
respectively,  and  A,  Ci  and  C2  are  the  Jacobians  of  the  state  equation  with  respect  to 
the  state  variables,  and  of  the  adjoint  equation  with  respect  to  both  control  variables, 
respectively.  With  and  gg  we  denote  the  discrete  gradients  for  /  and  g ,  respectively. 

Finally,  Uk,  fki  9k  and  Pk  are  the  search  directions  for  the  state,  control  (with  respect 
to  /  and  <7),  and  adjoint  variables,  respectively.  To  compute  the  right  hand  side  in 
(4.5),  we  solve  the  (nonlinear)  state  and  adjoint  equations  given  by  equations  (4.4a) 
and  (4.4b),  respectively,  for  iterates  fk  and  gk-  To  obtain  the  Gauss-Newton  system 
in  the  inversion  variables  only,  we  eliminate  the  blocks  corresponding  to  the  Newton 
updates  u  and  p  and  obtain 

Uk  =  —  A-1  (Ci  /fe  +  C2  gk), 

Pk  =  -A_tB  uk. 


Thus,  the  reduced  (Gauss-Newton)  Hessian  becomes 


H  = 


Ri 

0 

1 

rcn 

0 

r2 

“T 

cl  J 

A“tBA_1 


Ci  c2  ]  , 


and  the  reduced  linear  system  reads 


(4.7) 


H 

'  fk  ‘ 

=  _ 

'  9f 

.  9k  . 

.  9g  _ 

This  symmetric  positive  system  is  solved  by  the  preconditioned  conjugate  gradient 
method,  where  a  simple  preconditioner  is  given  by  the  inverse  of  the  regularization 
operator  (the  first  block  matrix  in  (4.7)). 

4.1.  Numerical  tests.  Here,  we  present  some  numerical  results  for  the  nonlin¬ 
ear  inverse  problem  (4.1).  The  upper  row  in  Figure  4.1  shows  the  noisy  measurement 
data  (left;  only  the  data  at  the  points  is  used  in  the  inversion),  the  “true”  volume 
source  /  (middle)  and  boundary  source  g  (right)  used  to  construct  the  synthetic  data. 
The  lower  row  depicts  the  results  of  the  inversions  for  the  regularization  parameters 
71  =  10-5  and  72  =  10-4.  The  middle  and  right  plots  on  the  same  row  show  the 
reconstruction  for  /  and  g ,  and  the  left  plot  shows  the  state  solution  corresponding 
to  these  reconstructions.  Note  that  the  regularization  for  the  volume  source  /  leads 
to  a  smooth  reconstruction.  The  L2-regularization  for  the  boundary  source  g  favors 
reconstructions  with  small  L2-norm  but  does  not  prevent  oscillations. 

The  optimality  system  (4.3)  is  solved  using  an  inexact  Gauss-Newton-CG  method 
as  described  in  Section  2.6.  The  inversion  is  based  on  56  measurement  points  (out  of 
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Fig.  4.1.  Results  for  the  nonlinear  elliptic  source  inversion  problem  with  c  =  103?  71  =  10-5 
and  72  =  10-4,  and  1%  noise  in  the  synthetic  data.  Noisy  data  u d  and  recovered  solution  u  (left 
column),  “true”  volume  source  and  recovered  volume  source  f  (center  column),  and  “ true ”  boundary 
source  and  recovered  boundary  source  g  (right  column). 


which  more  than  half  are  located  near  the  boundary  to  facilitate  the  inversion  for  the 
boundary  field).  The  mesh  consisted  of  1206  triangular  elements,  and  we  used  linear 
Lagrange  elements  for  the  discretization  of  the  source  fields  and  quadratic  elements 
for  the  state  and  the  adjoint.  The  iterations  are  terminated  when  the  L2-norms  of  the 
/-gradient  (g^)  and  the  ^-gradient  (gg)  drop  below  10-7.  For  this  particular  example, 
the  number  of  Gauss-Newton  iterations  was  11,  and  the  total  number  of  CG  iterations 
(with  an  adaptive  tolerance  for  the  CG  solve  ranging  from  5  x  10“ 1  to  6  x  10-4)  was 
177.  This  amounted  to  a  total  number  of  11  nonlinear  state  and  linear  adjoint  solves 
(one  at  each  Gauss-Newton  iteration),  and  177  (linear)  state-adjoint  solves  (one  at 
each  CG  iteration). 

5.  Extensions  and  other  frequently  occurring  features.  The  model  prob¬ 
lems  used  in  this  report  illustrate  several,  but  certainly  not  all  important  features 
that  arise  in  inverse  problems  with  PDEs.  A  few  more  typical  properties  that  are 
common  in  inverse  problems  governed  by  PDEs,  which  have  not  been  covered  by  our 
model  problems  are  mentioned  next. 

If  the  inversion  field  is  expected  to  have  discontinuities  but  is  otherwise  piece- 
wise  smooth,  the  use  of  non-quadratic  regularization  is  preferable  over  the  quadratic 
gradient  regularization  used  in  (2.1).  The  total  variation  regularization 

TV{a )  :=  [  |  Va|  dx 

Jn 

for  a  parameter  field  a  defined  on  Q  can,  in  this  case,  result  in  improved  reconstruc¬ 
tions  since  TV  (a)  is  finite  at  discontinuities,  while  quadratic  gradient  regularization 
becomes  infinite  at  discontinuities.  Thus,  with  quadratic  gradient  regularization  dis¬ 
continuities  are  smoothed,  while  they  often  can  be  reconstructed  when  using  TV(-) 
as  regularization. 

Another  frequently  occurring  aspect  in  inverse  problems  is  that  data  from  multiple 
experiments  is  available,  which  amounts  to  an  optimization  problem  with  several  PDE 
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constraints  (each  PDE  corresponding  to  an  experiment),  and  makes  the  computation 
of  first-  and  second-order  derivatives  costly. 

Finally,  we  mention  that  many  inverse  problems  involve  vector  systems  as  state 
equations,  which  can  make  the  derivation  of  the  corresponding  optimality  systems 
more  involved  compared  to  the  scalar  equations  used  in  our  model  problems. 

Appendix  A.  Implementation  of  model  problems.  In  this  appendix  our 
implementations  for  solving  the  model  problems  presented  in  this  paper  are  summa¬ 
rized.  While  in  the  following  we  use  the  COMSOL  Multiphysics  (v3.5a)  [10]  finite 
element  package  (and  the  MATLAB  syntax),  the  implementations  will  be  similar  in 
other  toolkits.  In  this  section,  we  describe  parts  of  the  implementation  in  detail. 
Complete  code  listings  are  provided  in  Appendix  B. 

A.l.  Steepest  descent  method  for  linear  elliptic  inverse  problem.  Our 

implementation  starts  with  specifying  the  geometry  and  the  mesh  (lines  2  and  3),  and 
with  defining  names  for  the  finite  element  basis  functions,  their  polynomial  order  and 
the  regularization  parameter  (lines  4-8).  In  this  example,  we  use  linear  elements  on 
a  hexahedral  mesh  for  the  coefficient  a  (and  for  the  gradient),  and  quadratic  finite 
element  functions  for  the  state  u ,  the  adjoint  p  and  the  desired  state  Ud-  The  latter 
is  used  to  compute  and  store  the  synthetic  measurements,  which  are  computed  from 
a  given  coefficient  atrue  (defined  in  line  6). 

2  fern  .  geom  =  rect  2  ( 0  , 1  ,  0 , 1 ) ; 

3  fem  .  mesh  =  meshmap  ( fem  ,  5  Edgelem  ’  ,{1  ,20  ,2  ,20}); 

4  fem .  dim  =  {’a’  ’grad’  ’p’  ’u’  ’ud’}; 

5  fem .  shape  =  [1  1  2  2  2]; 

6  fem  .  equ  .  expr  .  atrue  =  ’l  +  7*(  sqrt  ( (x  —  0.5)  "2  +  (y  —  0.5)  ~  2)  >0.2)  ’ ; 

7  fem  .  equ  .  expr  .  f  =  5 1  5 ; 

8  fem  .  equ  .  expr  . gamma  =  ’le— 9’; 


Homogeneous  Dirichlet  boundary  conditions  for  the  state  and  adjoint  equations  are 
used.  In  line  14  the  conditions  u  =  0,  p  =  0  and  Ud  —  0  on  dQ  are  enforced.  The  weak 
form  for  the  state  equation  with  the  target  coefficient  atrue,  which  is  used  to  compute 
the  synthetic  measurements,  is  given  in  line  15,  followed  by  the  state  equation  with 
the  unknown,  to-be-reconstructed  coefficient  a  (line  16).  Lines  17-19  define  the  weak 
forms  of  the  adjoint  and  the  gradient  equations,  respectively.  Note  that  in  COMSOL 
Multiphysics,  variations  (or  test  functions)  are  denoted  by  adding  “_test”  to  the 
variable  name. 


14 

15 

16 

17 

18 
19 


fem  .  bnd  .  r  =  {{’u’  ’p’  ’ud’}}; 

fem  .  equ  .  expr  .  goal  =  ’  — (  at  rue  *(  udx*  udx  _test+udy*  udy  .test )—  f  *  ud  .test  )  ’; 
fem  .  equ  .  expr  .  state  =  ’  —  (a*  ( ux*  ux  _t  est+uy  *  uy  _t  est )—  f  *  u  _t  es  t  )  ’ ; 
fem  .  equ  .expr.  adjoint  =  ’  —  (a*  ( px*  px  _t  est+py  *  py_test)  —  (ud— u)  *  p  _t  est  )  ’ ; 
fem  .  equ  .  expr  .  control  =  [’(  grad*  grad_test  —gamma*  (  ax*  gradx_test  ’  ... 

’  +  ay*grady_test)  —  (px*ux+py  *uy )  *  grad_test)  ’  ]  ; 


To  synthesize  measurement  data,  the  state  equation  with  the  given  coefficient  atrue 
is  solved  (lines  20-22). 

20  fem  .  equ  .  weak  if  ’goal 

21  fem  .  xmesh  =  meshextend  ( fem  )  ; 

22  fem. sol  =  femlin(fem,  ’Solcomp’,  {’ud’}); 


COMSOL  allows  the  user  to  access  its  internal  finite  element  structures  such  as  the 
degrees  of  freedom  for  each  finite  element  function.  Our  implementation  of  the  steep¬ 
est  descent  iteration  works  on  the  finite  element  coefficients,  and  the  indices  for  the 
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degrees  of  freedom  are  extracted  from  the  finite  element  data  structure  in  the  lines 
23-29.  Note  that  internally  the  unknowns  are  ordered  alphabetically  (independently 
from  the  order  given  in  line  4).  Thus,  the  indices  for  the  finite  element  function  a 
can  be  extracted  from  the  first  column  of  dof  s;  see  line  25.  If  different  order  element 
functions  are  used  in  the  same  finite  element  structure  (in  the  present  case,  linear  and 
quadratic  polynomials),  COMSOL  pads  the  list  of  indices  for  the  lower-order  function 
with  zeros.  These  zero  indices  are  removed  by  only  choosing  the  positive  indices  (lines 
25-29).  The  index  vectors  are  used  to  access  the  entries  in  X,  the  vector  containing 
all  finite  element  coefficients,  which  can  be  accessed  as  shown  in  line  30. 


We  add  noise  to  our  synthetic  data  (line  32)  to  lessen  the  “inverse  crime”  [18],  which 
occurs  due  to  the  fact  that  the  same  numerical  method  is  used  in  the  inversion  as  for 
creating  the  synthetic  data. 


32 


X(UDI)  =  X(UDI)  +  datanoise  *  max(  abs  (X(UDI ) ) )  *  randn  ( length  (UDI)  ,  1 ) ; 


We  initialize  the  coefficient  a  (line  33;  the  initialization  is  a  constant  function)  and 
(re-)define  the  weak  form  as  the  sum  of  the  state,  adjoint,  and  control  equations 
(line  34).  Note  that,  since  the  test  functions  for  these  three  weak  forms  differ,  one 
can  regain  the  individual  equations  by  setting  the  appropriate  test  functions  to  zero. 
To  compute  the  initial  value  of  the  cost  functional,  in  line  36  we  solve  the  system 
with  respect  to  u  only.  For  the  variables  not  solved  for,  the  finite  element  functions 
specified  in  X  are  used.  Then,  the  solution  of  the  state  equation  is  copied  into  X,  and 
is  used  in  the  evaluation  of  the  cost  functional  (lines  37  and  38). 

33 

34 

35 

36 

37 

38 


X(AI)  =  8.0; 

fem  .  equ  .  weak  =  ’state  +  adjoint  +  control 
fem  .  xmesh  =  meshextend  ( fem  )  ; 

fem. sol  =  femlin(fem,  ’Solcomp’,  f’u’},  ’U’  ,  X)  ; 

X(UI)  =  fem.  sol  .u(UI); 
cost_old  =  evaluate  .cost  (fem,  X); 


Next,  we  iteratively  update  a  in  the  steepest  descent  direction.  For  a  current  iterate 
of  the  coefficient  a,  we  first  solve  the  state  equation  (line  40)  for  u.  Given  the  state 
solution,  we  solve  the  adjoint  equation  (line  42)  and  compute  the  gradient  from  the 
control  equation  (line  44).  A  line  search  to  satisfy  the  Armijo  descent  criterion  [22] 
is  used  (line  56).  If,  for  a  step  length  <a,  the  cost  is  not  sufficiently  decreased,  back¬ 
tracking  is  performed,  be.,  the  step  length  is  reduced  (line  61).  In  each  backtracking 
step,  the  state  equation  has  to  be  solved  to  evaluate  the  cost  functional  (lines  53-54). 


39 

40 

41 

42 

43 

44 

45 

46 


for  iter  =  l:maxiter 

fem.  sol  =  femlin(fem,  ’Solcomp’,  {  ’  u  ’  }  ,  ’U’  ,  X); 

X(UI)  =  fem.  sol  .u(UI); 

fem.  sol  =  femlin(fem,  ’Solcomp’,  {  ’  p  ’  }  ,  ’U’  ,  X); 

X(PI)  =  fem.  sol  .  u  ( PI )  ; 

fem.  sol  =  femlin(fem,  ’Solcomp’,  {’grad’},  ’U’  ,  X)  ; 

X(GI)  =  fem.  sol  .u(GI); 

grad2  =  postint  (fem,  ’grad  *  grad’); 
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47 


48 

49 

50 

51 

52 

53 

54 

55 

56 

57 

58 

59 

60 
61 
62 

63 

64 

65 

66 

67 

68 

69 

70 

71 


Xtry  =  X; 

alpha  =  alpha  *  1.2; 

descent  =  0; 
no.backtrack  =  0; 

while  ('descent  &&  no.backtrack  <  10) 

Xtry(AI)  =  X(AI)  -  alpha  *  X(GI); 

fem.sol  =  femlin(fem,  ’Solcomp’,  {  ’  u  ’ }  ,  ’U’  ,  Xtry); 

Xtry(UI)  =  fem  .  so  1  .  u  ( UI )  ; 

[cost  ,  misfit  ,  reg]  =  e valuat e _cost  (fem,  Xtry); 
if  (cost  <  cost_old  —  alpha  *  c  *  grad2) 
cost  _old  =  cost ; 
descent  =  1; 

else 

no_backtrack  =  no_backtrack  +  1; 
alpha  =  0.5  *  alpha; 

end 

end 

X  =  Xtry; 

fem.sol  =  femsol  (X)  ; 
if  (sqrt(grad2)  <  tol  &&  iter  >  1) 

fprintf  ([  ’  Gradient  method  converged  after  %d  iterations  .  ’  .  .  . 
’\n\n  ’]  ,  iter  )  ; 

break  ; 
end 
end 


The  steepest  descent  iteration  is  terminated  when  the  norm  of  the  gradient  is  suffi¬ 
ciently  small.  The  complete  code  listing  of  the  implementation  of  the  gradient  method 
applied  to  solve  problem  (2.1)  can  be  found  in  Appendix  B.l. 

A. 2.  Gauss-Newton  CG  method  for  linear  elliptic  inverse  problem.  Dif¬ 
ferently  from  the  implementation  of  the  steepest  descent  method,  which  relies  to  a 
large  extent  on  solvers  provided  by  the  finite  element  package,  this  implementation 
makes  explicit  use  of  the  discretized  operators  corresponding  to  the  state  and  the 
adjoint  equations.  For  brevity  of  the  description,  we  skip  steps  that  are  analogous  to 
the  steepest  descent  method  and  refer  to  Appendix  B.2  for  a  complete  listing  of  the 
implementation. 

After  setting  up  the  mesh,  the  finite  element  functions  a,  u  and  p  corresponding 
to  the  coefficient,  state  and  adjoint  variables,  as  well  as  their  increments  a,u  and  p, 
are  defined  in  lines  4  and  5. 


fem .  dim  =  {’a’  ’a0’  ’delta_a  ’  ’delta.p  ’ 

’  delt a_u  ’ 

’P’ 

’ll’ 

’ud’}; 

fem.  shape  =  [11122222]; 

Homogeneous  Dirichlet  conditions  are  used  for  the  state  and  adjoint  variable,  as  well 
as  for  their  increment  functions;  see  line  16.  After  initializing  parameters,  the  weak 
forms  for  the  construction  of  the  synthetic  data  (lines  17  and  18),  and  the  weak  forms 
for  the  Gauss-Newton  system  are  defined  (lines  19-24). 

16 

17 

18 

19 

20 

21 

22 

23 

24 


fem 

bnd  . 

r  = 

{  {  5  delt  a_u  ’  ’delta 

_p  ’  ’u  ’  ’ud  ’  }  }  ; 

fem 

equ  . 

expr 

.goal  =  ’  —  (  at  rue  * 

( udx*  udx  _t  est+udy 

*udy_test)— f*ud_ 

t est )  ’ ; 

fem 

equ  . 

expr 

.state  =  [  ’  —  (a*(ux*  ux_test+uy*  uy  _t 

est )—  f  *  u  _t  est  )  ’  ] 

; 

fem 

equ  . 

expr 

.  incst ate  =  [  ’  — (a 

*(delta_ux*  delta_ 

px  _t  est+delta_uy 

’  .  .  . 

’*  de 

lta_py_test)+delt 

a_a*(ux*delta_px_ 

test  +uy  *delta_py 

_test))  ’  ]  ; 

fem  . 

.  equ  . 

expr 

.  incadj  oint  =  [  ’  — 

(a*(delta_px*delt 

a.ux.t  est  + 

delt 

;a_py 

*  del 

ta_uy_test)+delta 

_u  * delta_u_test  ) 

]; 

fem  . 

.  equ  . 

expr 

.  inccont rol  =  [  ’  — 

(gamma*  (delta_ax* 

de  It  a_ax  _t  es  t+de 

lta_ay  ’  .  .  . 

’  *  d  e 

lta_ay_test )  +  (  delta_px  *ux+delt  a_py 

*uy)*  delta_a_tes 

O’]; 
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As  in  the  implementation  of  the  steepest  descent  method,  the  synthetic  data  is  based 
on  a  “true”  coefficient  atrue,  and  noise  is  added  to  the  synthetic  measurements; 
the  corresponding  finite  element  function  is  denoted  by  ud.  The  indices  pointing  to 
the  coefficients  for  each  finite  element  function  in  the  coefficient  vector  X  are  stored 
in  AI,dAI,dPI,. . . .  After  setting  the  coefficient  a  to  a  constant  initial  guess,  the 
reduced  gradient  is  computed,  which  amounts  to  the  right  hand  side  in  the  reduced 
Hessian  equation.  For  that  purpose,  the  state  equation  (lines  50-52)  is  solved. 

50  fem  .  xmesh  =  meshextend  ( fem  )  ; 

51  fem.  sol  =  femlin(fem,  ’Solcomp’,  {’ii5}}  ’U’  ,  X); 

52  X(UI)  =  fem.  sol  .u(UI); 

Next,  the  KKT  system  is  assembled  in  line  57.  Note  that  the  system  matrix  K  does 
not  take  into  account  Dirichlet  boundary  conditions.  These  conditions  are  enforced 
through  the  constraint  equation  N*X=M  (where  N  and  M  are  the  left  and  right  hand 
sides  of  the  boundary  conditions  and  can  be  returned  by  the  assemble  function). 

56  fem  .  xmesh  =  meshextend  ( fem  )  ; 

57  [K,  N]  =  assemble  (fem  ,  ’U’  ,  X,  ’out’,  { ’K%  ’N’ } )  ; 

Our  implementation  explicitly  uses  the  individual  blocks  of  the  KKT  system — these 
blocks  are  extracted  from  the  KKT  system  using  the  index  vectors  dAI,  dUI,  dPI; 
see  lines  58-61.  Note  that  the  choice  of  the  test  function  influences  the  location  of 
these  blocks  in  the  KKT  system  matrix.  Since  Dirichlet  boundary  conditions  are 
not  taken  care  of  in  K  (and  thus  in  the  matrix  A,  which  corresponds  to  the  state 
equation),  these  constraints  are  enforced  by  a  modification  of  A  (see  lines  58-61).  The 
modification  puts  zeros  in  rows  and  columns  of  Dirichlet  nodes  and  ones  into  the 
diagonals;  see  lines  62-68.  Additionally,  changes  to  the  right  hand  sides  are  made 
using  a  vector  chi,  which  contains  zeros  for  Dirichlet  degrees  of  freedom  and  ones  in 
all  other  components  (lines  69-70). 

58  W  —  K(dUI ,  dUI )  ; 

59  A  =  K(dUI ,  dPI ) ; 

60  C  —  K(  dPI  ,  dAI )  ; 

61  R  =  K(dAI ,  dAI )  ; 

62  ind  =  find  (sum(N( :  ,  dUI),l)~  =  0); 

63  A ( :  ,  ind  )  =  0 ; 

64  A(  ind  ,  :  )  =  0 ; 

65  for  (k  =  1 :  length  ( ind  ) ) 

66  i  =  ind  (k  )  ; 

67  A(  i  ,  i )  =  1 ; 

68  end 

69  chi  =  ones  (  size  (A,  1 )  ,  1 )  ; 

70  chi(ind)  =  0; 

An  alternative  approach  to  eliminate  the  degrees  of  freedom  corresponding  to  Dirichlet 
boundary  conditions  is  to  use  a  null  space  basis  of  the  constraints  originating  from 
the  Dirichlet  conditions  (or  other  essential  boundary  conditions,  such  as  periodic 
conditions).  COMSOL’s  function  femlin  can  be  used  to  compute  the  matrix  Null, 
whose  columns  form  a  basis  of  the  null  space  of  the  constraint  operator  (z.e.,  the 
left  hand  side  matrix  of  the  constraint  equation  N*X=M).  For  instance,  the  following 
lines  show  how  to  eliminate  the  Dirichlet  boundary  conditions  and  solve  the  adjoint 
problem  using  this  approach. 
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[Ke  ,  Le  ,  Null  ,  ud]  =  femlin  (  ’  in  5  ,  {  ’K’  K(PI,PI)  ’L’  L(PI)  ’M’  M  ’N 
N ( :  >  PI ) } ) 5 

X(PI)  =  Null  *  (Ke  \  Le); 


Note  that  the  assemble  function  linearizes  nonlinear  weak  forms  at  the  provided  lin¬ 
earization  point  (X  in  our  case — the  linearization  point  is  zero  if  it  is  not  provided). 
By  setting  the  adjoint  variable  to  zero,  we  avoid  unwanted  contributions  to  the  lin¬ 
earized  matrices.  In  the  above  code  snippet,  we  use  femlin  to  provide  a  basis  of  the 
constraint  space,  which  allows  us  to  compute  the  solution  in  the  (smaller)  constraint 
space.  This  solution  Ke\Le  is  then  lifted  to  the  original  space  by  multiplying  it  with 
Null. 

After  this  side  remark,  we  continue  with  the  description  of  the  Gauss-Newton- 
CG  method  description  for  the  elliptic  parameter  estimation  problem.  Since  the  state 
operator  A  has  to  be  inverted  repeatedly,  we  compute  its  Choleski  factorization  be¬ 
forehand;  see  line  71.  Now,  to  compute  the  right  hand  side  of  the  Gauss-Newton 
system,  the  adjoint  equation  can  be  solved  by  a  simple  forward  and  backward  elimi¬ 
nation  (line  72).  The  resulting  adjoint  is  used  to  compute  the  reduced  gradient  (line 
73).  Note  that  MG  denotes  the  coefficients  of  the  reduced  gradient,  multiplied  by  the 
mass  matrix,  which  corresponds  to  the  right  hand  side  in  the  reduced  Gauss-Newton 
equation  (see  equation  (2.14)  in  Section  2.5). 


71 


72 

73 


AF  =  chol(A); 

X(PI)  =  AF  \  (AF5  \  (chi  .  *  ( W  *  (X(UDI)  -  X(UI ) ) )  ) )  ; 

MG  =  C’  *  X(PI)  +  R  *  X(AI)  -  mu  *  (1./  (X(AI)  -  X(AOI))); 


To  solve  the  reduced  Hessian  system  and  obtain  a  descent  direction,  we  use  MAT- 
LAB’s  conjugate  gradient  function  peg.  Thereby,  the  function  elliptic_apply_logb, 
which  is  specified  below,  implements  the  application  of  the  reduced  Hessian  to  a  vec¬ 
tor.  The  right  hand  side  in  the  system  is  given  by  the  negative  gradient  multiplied 
by  the  mass  matrix.  We  use  a  loose  tolerance  early  in  the  CG  iteration,  and  tighten 
the  tolerance  as  the  iterates  get  closer  to  the  solution,  as  described  in  Section  2.6  (see 
line  80). 

80 
81 
82 


tolcg  =  min(0.5,  sqrt  ( gradnorm/ gradnorm_ini  ) )  ; 

P  =  R  +  le  — 10*eye  ( length  (AI ))  ; 

[D,  flag  ,  relres  ,  CGiter  ,  resvec  ]  =  peg  (@(D)  e  1 1  i p  t  i c  _ap p ly  _1  o g b 

(V,  chi,  W,  AF,  C,  R,  X,  AI,  AOI  ,  mu), -MG,  tolcg,  300,  P 


The  vector  D  resulting  from  the  (approximate)  solution  of  the  reduced  Hessian  system 
is  used  to  update  the  coefficients  of  the  FE  function  a.  A  line  search  with  an  Armijo 
criterion  is  used  to  globalize  the  Gauss-Newton  method.  In  the  presence  of  inequality 
constraints,  we  render  the  logarithmic  barrier  parameter  mu  to  a  positive  value  (e.g., 
mu=le-10)  and  check  for  constraint  violations  (lines  89-99).  This  is  done  by  computing 
the  maximum  allowable  step  length  in  the  line  search  to  maintain  X(AI)  >  X(AOI) 
(lines  97-98). 


89 

90 

91 

92 

93 

94 

95 

96 

97 

98 


idx  =  find  (D  <  0); 

Aviol  =  X( AI)  -  X(A0I)  ; 
if  min(Aviol)  <=  le— 9 

error(’point  is  not  feasible  ’ ) ; 

end 

if  ( isempty  ( idx  ) ) 
alpha  =  1; 
else 

alphal  =  min(  —  Aviol  ( idx  ). /D(  idx  ))  ; 
alpha  =  min  ( 1  ,  0.9995*  alphal  )  ; 
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end 


We  conclude  the  description  of  the  Gauss-Newton  implementation  with  giving 
the  function  elliptic_apply_logb  for  both  approaches  to  incorporate  the  Dirichlet 
boundary  conditions.  This  function  applies  the  reduced  Hessian  to  a  vector  D.  Note 
that,  since  the  precomputed  Choleski  factor  AF  is  triangular,  solving  the  state  and  the 
adjoint  equation  only  amounts  to  two  forward  and  two  backward  elimination  steps. 


fun< 

:t  ion 

GNV  =  e 

1 1  i  p  t  i  c 

-apply- 

logb  (V,  chi  , 

W,  AF,  C,  R,  X,  AI,  AOI,  mu) 

du  = 

=  AF 

\  (AF’  \ 

(chi  . 

*  (— C  * 

V))); 

dp  = 

=  AF 

\  (AF’  \ 

(chi  . 

*  (-W  * 

du))); 

Z  = 

mu  * 

spdiags 

(i  •/ 

(X(AI) 

-  X(  AOI ) )  •  ~  2 

,  0,  length(AI),  length(AI)); 

GNV 

=  C’ 

*  dp  + 

(R  +  Z) 

*  V; 

The  second  approach,  which  uses  the  basis  of  the  constraint  null  space  Null,  reads: 


fun< 

:t  ion 

GNV  = 

el 

1  i  p  t  i  c 

.apply  _logb  (V, 

Null,  W,  AF,  C,  R,  X,  AI,  AOI,  mu) 

du  = 

=  AF 

\  (AF’ 

\ 

(Null  ’ 

*  (-C  *  V))); 

dp  = 

=  AF 

\  (AF’ 

\ 

(Null  ’ 

*  (-W  *  (Null 

*  du)))); 

Z  = 

mu  * 

spdiags  ( 

1  •/ 

(X(AI)-X(AOI)). 

"2,  0,  length(AI),  length(AI)); 

GNV 

=  C’ 

*  ( Nu 

11 

*  dp ) 

+  (R  +  Z)  *  V; 

A. 3.  Conjugate  gradient  method  for  inverse  advective-diffusive  trans¬ 
port.  Here  we  summarize  implementation  aspects  that  are  particular  to  the  inverse 
advective-diffusive  transport  problem.  The  matrices  corresponding  to  the  stationary 
advection-diffusion  operator  and  the  mass  matrix  are  assembled  similarly  as  in  the 
elliptic  model  problem.  The  discretization  of  the  measurement  operator  Q  is  a  mass 
matrix  for  the  boundary  Tm.  To  obtain  this  matrix,  we  set  the  weak  form  corre¬ 
sponding  to  Q  to  zero  (line  49) ,  and  use  a  mass  matrix  for  the  boundary  Tm  only  (line 
50). 

49 

50 

51 

52 

53 

The  most  interesting  part  of  the  implementation  for  this  time-dependent  problem  is 
the  application  of  the  state-adjoint  solve  (see  Section  3.1),  which  is  discussed  next. 
The  function  ad_apply  that  applies  the  left  hand  side  from  (3.5)  to  an  input  vector  U0 
is  listed  below  in  lines  1-17.  The  time  steps  for  the  state  variable  UU  and  the  adjoint 
PP  are  stored  in  columns,  by  L  we  denote  the  discrete  advection-diffusion  operator, 
and  by  M  the  domain  mass  matrix.  The  lines  5-8  correspond  to  the  solution  of  the 
state  equation,  and  lines  9-16  to  the  solution  of  the  adjoint  equation,  which  is  solved 
backwards  in  time.  The  factor  computed  in  line  12  controls  if  measurements  are  taken 
for  time  instances  or  not.  Note  that  the  adjoint  equation  is  solved  using  the  discrete 
adjoint  scheme  (in  space  and  time)  we  described  in  Section  3.1. 

1  function  out  =  ad_apply(U0,  L,  M,  R,  Q,  dt  ,  Tm,  ntime  ,  gammal ,  gamma2 , 

2  UU,  PP) 

3  global  cgcount ; 

4  cgcount  =  cgcount  +  1; 

s  UU ( :  ,1)  =  U0; 

6  for  k  =  1:  (ntime—  1) 

7  UU  ( :  ,  k+1)  =  L  \  (M  *  UU  ( :  ,  k  ) ) ; 

8  end 

9  F  =  Q  *  (—  dt  *  UU ( :  ,  ntime  ))  ; 


fern  .  equ  .  weak  =  ’O’; 

fern  .  bnd  .  weak  =  {{}  {}  {}  —  u  *  u_test  ’}  {}}; 

fern  .  xmesh  =  meshextend  ( fern  )  ; 

MB  =  assemble  ( fern  ,  ’out’,  ’K’)> 

Q  =  MB(UI,UI); 
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11 

12 

13 

14 

15 

16 
17 


89 

90 

91 

92 

93 

94 


105 

106 
107 


109 

110 
111 


124 


PP  ( :  ,ntime)  =  L  ’  \  F; 
for  k  =  (  ntime  —  1) :  —  1 :2 

m  =  ( k  >  Tm  *  ntime  )  ; 

F  =  M  *  PP  ( :  ,  k+1)  +  m  *  Q  *  (—  dt  *  UU ( :  ,  k  ) )  ; 

PP(:  ,k)  =  L’  \  F; 

end 

MP( : ,1)  =  M  *  PP(: ,2); 

out  ■  —  MP(:  ,1)  +  gammal  *  M  *  UO  +  gamma2  *  R  *  UO; 

A  complete  code  listing  for  this  problem  can  be  found  in  Appendix  B.3. 

A. 4.  Gauss-Newton-CG  for  nonlinear  source  inversion  problem.  Some 
implementation  details  for  the  nonlinear  inverse  problem  (4.1)  are  described  next. 
Most  parts  of  the  code  are  discussed  above,  hence  we  focus  on  a  few  new  aspects  only. 

We  recall  that  the  inversion  in  (4.1)  is  based  on  discrete  measurement  points 
rather  than  distributed  measurements.  The  discretization  of  this  operator  is  a  mass 
matrix  for  the  data  points  Xj,  j  =  1, ...,  Nr.  To  obtain  this  matrix,  we  set  the  weak 
form  corresponding  to  and  the  boundary  T  to  zero  (lines  89-90)  and  add  a  mass 
matrix  contribution  for  the  data  points  (line  91). 

fem  .  equ  .  weak  =  ’O’; 

fem  .  bnd  .  weak  =  ’O’; 

fem  .  pnt  .  weak  =  {{},{’  —  P  *  P-test  ’}}; 

fem  .  xmesh  =  meshextend  ( fem  )  ; 

K=  assemble  ( fem  ,  ’out’,  ’K’); 

B  =  K(  PI  ,  PI )  ; 

The  second  novelty  in  problem  (4.1)  is  that  we  invert  for  two  fields,  /  and  g,  where 
g  is  defined  on  the  boundary.  Therefore,  the  ^-gradient  (control)  equation  is  defined 
on  the  boundary.  In  line  105,  we  define  the  weak  forms  corresponding  to  the  state 
equation  and  the  incremental  equations.  We  note  that  the  weak  form  of  the  equation 
for  the  gradient  with  respect  to  g  is  defined  on  the  boundary  (see  line  (106)).  The 
remaining  terms  in  this  definition  correspond  to  the  weak  forms  of  the  second  variation 
of  the  Lagrangian  with  respect  to  p  and  g,  and  the  boundary  weak  term  corresponding 
to  the  state  equation. 

fem . equ .  weak  =  ’state  +  incstate  +  incadjoint  +  inccontrolf 
fem  .  bnd  .  weak  =  {{’—  gamma2*  delt  a_g  *  delt  a_g_test  ’... 

’  —  delta_p*delta_g_test—  delta_g*delta_p_test  —  g*  u_test  7  }  }  ; 

Since  the  state  equation  is  nonlinear,  we  use  COMSOL’s  built-in  nonlinear  solver 
femnlin  (lines  109-111)  to  solve  for  the  state  variable  u.  Alternatively,  one  could 
implement  a  Newton  method  for  the  solution  of  the  nonlinear  (state)  equation  instead 
of  relying  on  femnlin. 

fem  .  xmesh  =  meshextend  ( fem  )  ; 

fem.  sol  =  femnlin(fem,  ’  Solcomp  ’  ,  f’u’},  ’U’  ,  X,  ’ntol’,  le— 9); 

X(UI)  =  fem.  sol  .u(UI); 

Next,  we  compute  the  Choleski  factor  of  A  (note  that  the  state  operator  is  always 
positive  definite),  the  matrix  corresponding  to  the  state  operator  and  use  it  to  compute 
the  adjoint  solution  by  a  forward  and  a  backward  elimination  (line  124),  where  B  is 
computed  in  line  94. 

X(PI)  =  AF  \  (AF’  \  (B  *  (X(UDI)  -X(UI)))); 

Finally,  we  depict  the  function  nl_apply,  which  applies  the  reduced  Hessian  to  a 
vector  as  described  in  Section  4.  Note  that  the  control  equation  is  a  system  for 
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/  and  g ,  hence  the  apply  function  computes  the  action  of  the  Hessian  on  a  vector 
corresponding  to  both  updates  fk  (denoted  by  V\)  and  gk  (denoted  by  V2)  (lines  7- 
8)- 


func 

don  GNV  =  nl_a 

pply (v,  B, 

AF, 

Cl,  C2,  Rl,  R2) 

[  n  ,m 

=  size (R1 )  ; 

VI  = 

V(  1 :  n ) ; 

V2  = 

V(n  +  l:end ) ; 

du  = 

AF  \  (AF’  \  (- 

Cl 

*  VI  - 

C2  * 

V2)); 

dp  = 

AF  \  (AF1  \  (- 

B  * 

du)); 

GNV1 

=  Cl ’  *  dp  +  R1  * 

VI; 

GNV2 

=  C2 ’  *  dp  +  R2  * 

V2 ; 

GNV  = 

[GNVl;  GNV2]  ; 

The  complete  code  listing  for  this  problem  can  be  found  in  Appendix  B.4. 

Appendix  B.  Complete  code  listings. 

B.l.  Steepest  descent  for  elliptic  inverse  problem.  The  complete  imple¬ 
mentation  of  the  steepest  descent  method  to  solve  the  elliptic  parameter  inversion 
problem  (2.1)  is  given  below.  Parts  of  the  implementation  are  discussed  in  Appendix 
A.l. 

1  clear  all;  close  all; 

2  fern  .  geom  =  rect  2  ( 0 , 1  , 0 , 1 ) ; 

3  fem  .  mesh  =  meshmap  ( fem  ,  ’Edgelem’,  {1,20,2,20}); 

4  fem .  dim  =  {’a’  ’grad’  ’p’  ’u’  ’ud’}; 

5  fem .  shape  =  [1  1  2  2  2]; 

6  fem  .  equ  .  expr  .  atrue  =  T  +  7*(  sqrt  ( (x  —  0.5)  ~2  +  (y  —  0.5)~2)  >  0.2)’; 

7  fem  .  equ  .  expr  .  f  =  ’1’; 

8  fem  .  equ  .  expr  .  gamma  =  ’  1  e  —  9  ’ ; 

9  datanoise  =  0.01; 

10  maxiter  =  500; 

11  tol  =  le— 8; 

12  c  =  le— 4; 

13  alpha  =  le5  ; 

14  fem  .  bnd .  r  =  {{’u’  ’p’  ’ud’}}; 

15  fem  .  equ  .  expr  .  goal  =  ’  — (  atrue  *( udx*  udx _test+udy*  udy_test )  —  f  *  ud -test  )’ ; 

16  fem  .  equ  .  expr  .  state  =  ’  —  (a*  ( ux*  ux  _t  est+uy  *  uy  _t  est )—  f  *  u  _t  es  t  )  ’ ; 

17  fem  .  equ  .  expr  .  adj  oint  =  ’  —  (a*  ( px*  px _t est+py*  py  _t  est )  —  (ud— u)  *  p  _t  est  )  ’ ; 
is  fem  .  equ  .  expr  .  control  m  [’(  grad*  grad_test —gamma*  (  ax*  gradx_test+ay  ’..  . 

19  ’*  grady.t  est ) —  (px*ux+py*uy)  *  grad  .test  )  ’]; 

20  fem  .  equ  .  weak  =  ’goal  ’; 

21  fem  .  xmesh  =  meshextend  ( fem  )  ; 

22  fem.  sol  =  femlin(fem,  ’Solcomp’,  {’ud’}); 

23  nodes  =  xmeshinfo  ( fem  , ’out’,  ’nodes’); 

24  dofs  =  nodes  .  dofs  ’  ; 

25  AI  =  dofs  (  dofs  (:,  1  )>  0 , 1 ) ; 

26  GI  =  dofs  (dofs  (:  ,2)  >0  ,2); 

27  PI  m  dofs  (  dofs  ( :  ,3)  >0  ,3) ; 

28  UI  =  dofs  (dofs  (:  ,4)  >0  ,4); 

29  UDI  =  dofs(dofs(:,5)>0,5); 

30  X  —  fem  .  sol  .  u  ; 

31  randn  (  ’  seed  ’  ,  0 ) ; 

32  X(UDI)  =  X(UDI)  +  datanoise  *  max(  abs  (X(UDI ) ) )  *  randn  ( length  (UDI)  ,  1 ) ; 

33  X(  AI)  =  8.0; 

34  fem .  equ .  weak  =  ’state  +  adjoint  +  control  ’; 

35  fem  .  xmesh  =  meshextend  ( fem  )  ; 

36  fem.  sol  =  femlin(fem,  ’Solcomp’,  {’u’},  ’U’  ,  X); 

37  X(UI)  =  fem.  sol  .u(UI); 

38  cost_old  =  evaluate _cost  (fem,  X); 

39  for  iter  =  l:maxiter 
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fern. sol  =  femlin(fem,  ’Solcomp’,  {  ’  u  ’  }  ,  ’U’  ,  X); 

X(UI)  =  fem.  sol  .u(UI); 

fem.sol  =  femlin(fem,  ’  Solcomp  ’  ,  {  ’  p  ’  }  ,  ’U’  ,  X); 

X(PI)  =  fem.  sol  .  u  (  PI )  ; 

fem.sol  =  femlin(fem,  ’Solcomp’,  {’grad’},  ’U’  ,  X); 

X(GI)  =  fem.  sol  .u(GI); 

grad2  =  postint  (fem,  ’grad  *  grad’); 

Xtry  =  X; 

alpha  =  alpha  *  1.2; 

descent  =  0; 
no.backtrack  =  0; 

while  (“descent  &&  no.backtrack  <  10) 

Xtry(AI)  =  X(AI)  -  alpha  *  X(GI); 

fem.sol  =  femlin(fem,  ’Solcomp’,  {  ’  u  ’  }  ,  ’U’  ,  Xtry); 

Xtry(UI)  =  fem  .  so  1  .  u  ( UI )  ; 

[cost  ,  misfit  ,  reg]  =  evaluate.cost  (fem,  Xtry); 
if  (cost  <  cost_old  —  alpha  *  c  *  grad2) 
cost  _old  =  cost ; 
descent  =  1; 

else 

no_backtrack  =  no_backtrack  +  1; 
alpha  =  0.5  *  alpha; 

end 

end 

X  =  Xtry; 

fem.sol  =  femsol  (X)  ; 
if  (sqrt(grad2)  <  tol  &&  iter  >  1) 

fprintf  ([’Gradient  method  converged  after  %d  iterations.  ’... 
’\n\n  ’]  ,  iter  )  ; 

break  ; 

end 

end 

The  cost  function  evaluation  for  given  solution  vector  X  is  listed  below: 

function  [cost  ,  misfit  ,  reg]  =  evaluate  .cost  (fem  ,  X) 
fem.sol  =  femsol(X); 

misfit  =  postint  (fem,  ’0.5  *  (u  —  ud ) ~ 2  ’ ) ; 

reg  =  postint  (fem,  ’0.5  *  gamma  *  ( ax  "2+ ay  “  2 )  ’); 

cost  =  misfit  +  reg  ; 


B.2.  Gauss-Newton-CG  for  linear  elliptic  inverse  problems.  Below,  we 
give  the  complete  COMSOL  Multiphysics  implementation  of  the  Gauss-Newton-CG 
method  to  solve  the  elliptic  coefficient  inverse  problem  described  in  Section  2. 

1  clear  all;  close  all; 

2  fem  .  geom  =  rect  2  ( 0  , 1  ,  0 , 1 ) ; 

3  fem  .  mesh  =  meshmap(fem,  ’  Edgelem  ’  ,  {1,20,2,20}); 

4  fem .  dim  =  {’a’  ’a0’  ’delta_a  ’  ’delta.p  ’  ’delta.u  ’  ’g’  ’p 

5  fem .  shape  =  [1  1122122  2]; 

6  fem  .  equ  .  expr  .  atrue  =  ’1  +  7*(  sqrt  ( (x  —  0.5)  “2  +  (y  — 0.5)~2) 

7  datanoise  =  0.01; 

8  fem  .  equ  .  expr  .  f  =  ’1’; 

9  fem  .  equ  .  expr  .  gamma  =  ’  1  e  —  9  ’ ; 

10  maxiter  =  100; 
n  tol  =  1  e  —8; 

12  c  =  le— 4; 

13  rho  =  0.5; 

14  mu  =  0; 

15  nrCGiter  =  0; 

16  fem .  bnd .  r  =  {{’delta_u’  ’delta_p’  ’u’  ’ud’}}; 

17  fem  .  equ  .  expr  .  goal  =  ’  —  (  at  rue  *  ( udx*  udx  _t  est+udy  *  udy  _t  est )—  f  *  ud  _t  est  )  ’ ; 


’u’  ’ud’}; 

>  0.2)  ’; 
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is  fern  .  equ  .  expr  .  state  =  5  —  (a*  ( ux*  ux  _t  est+uy  *  uy  _t  est )—  f  *  u  _t  es  t  )  5 ; 

19  fern  .  equ  .  expr  .  incst ate  =  [  ’  —  (a*(  delta_ux*  delt  a_px_t est+delta_uy  *  ’  .  .  . 

20  ’delta_py_test)+delta_a*(ux*delta_px_test  +uy *delta_py_test))  ’  ]  ; 

21  fern  .  equ  .  expr  .  incadj  oint  =  [  ’  —  (a*(  delta_px*  delt  a_ux_test+delta_py  *  ’ .  .  . 

22  ’  de  It  a  _uy  _t  es  t )+  delt  a_u  *  d  e  1 1  a  _u  _t  e  s  t  )  ’]; 

23  fern  .  equ  .  expr  .  inccontrol  =  [  ’  —  (gamma*  (  delt  a_ax  *  de  It  a  _ax  _t  es  t+delt  a_ay  ’  ... 

24  ’  *  d  e  1 1  a  _ay  _t  es  t )  +  (  delt  a_px  *ux+delt  a_py  *uy )  *  d  e  1 1  a  _a  _t  e  s  t  )  ’]; 

25  fern  .  equ  .  weak  =  ’goal 

26  fern  .  xmesh  =  meshextend  ( fern  )  ; 

27  fern,  sol  =  femlin(fem,  ’Solcomp’,  ’ud’); 

28  nodes  =  xmeshinfo  ( fern  , ’out’,  ’nodes’); 

29  dofs  =  nodes  .  dofs  ’  ; 

30  AI  =  dofs  (  dofs  ( :  ,  1 )  >  0 , 1 ) ; 
si  AOI  =  dofs(dofs(:  ,2)  >0,2); 

32  dAI  =  dofs  (  dofs  ( :  ,  3 )  >  0  , 3 ) ; 

33  dPI  =  dofs  (  dofs  ( :  ,  4 )  >  0  , 4 ) ; 

34  dUI  =  dofs  (  dofs  ( :  ,  5 )  >  0  , 5 ) ; 

35  GI  =  dofs  (  dofs  ( :  ,  6 )  >  0  , 6 ) ; 

36  PI  =  dofs  (  dofs  ( :  ,  7)  >  0 , 7) ; 

37  UI  =  dofs  (  dofs  ( :  ,  8 )  >  0  , 8 ) ; 
ss  UDI  =  dofs  (  dofs  (:  ,9)  >0,9); 

39  X  —  fern  .  sol  .  u  ; 

40  fern  .  equ  .  weak  =  ’—  g  *  g_test  ’; 

41  fern  .  xmesh  =  meshextend  ( fern  )  ; 

42  K  —  assemble  (fern  ,  ’out  ’  ,  ’K’ )  ; 

43  M  =  K(GI ,  GI )  ; 

44  randn  (  ’  seed  ’  ,  0); 

45  X(UDI)  =  X(UDI)  +  datanoise  *  max(  abs  (X(UDI ) ) )  *  randn  ( length  (UDI)  ,  1 ) ; 

46  X(  AI)  =  8.0; 

47  X(A0I)  =  1.0; 

48  fern .  equ .  weak  =  ’state  +  incstate  +  incadjoint  +  inccontrol  ’; 

49  for  iter  =  l:maxiter 

50  fern  .  xmesh  =  meshextend  ( fern  )  ; 

51  fern,  sol  =  femlin(fem,  ’Solcomp’,  {  ’  u  ’ }  ,  ’U’  ,  X); 

52  X(UI)  =  fern,  sol  .u(UI); 

53  if  (iter  =  1) 

54  [cost_old  ,  misfit  ,  reg  ,  logb  ]  =  ellipt  ic  _cost  _logb  (fern  ,X,  AI ,  AOI  ,mu)  ; 

55  end 

56  fern  .  xmesh  =  meshextend  ( fern  )  ; 

57  [K,  N]  =  assemble  ( fern  ,  ’U’  ,  X,  ’out’,  { ’K’  ,  ’  N  ’  }  )  ; 

58  W=  K(dUI,dUI); 

59  A  =  K(  dUI ,  dPI )  ; 

eo  C  =  K(dPI,dAI); 

61  R  =  K(dAI,dAI); 

62  ind  =  find  (sum(N( :  ,  dUI),l)~  =  0); 

63  A( :  ,  ind  )  =  0  ; 

64  A(  ind  ,  :  )  =  0  ; 

65  for  (k  =  1 :  length  ( ind  ) ) 

66  i  =  ind  ( k  )  ; 

67  A(  i  ,  i )  =  1 ; 

68  end 

69  chi  =  ones  (  size  (A,  1 )  ,  1 )  ; 

70  chi(ind)  =  0; 

71  AF  =  chol(A); 

72  X(PI)  =  AF  \  (AF’  \  (chi  .*  (W  *  (X(UDI)  -X(UI))))); 

73  MG  =  C’  *  X(PI)  +  R  *  X(AI)  ; 

74  RHS  =  AV1G  +  mu  .  /  (X(  AI )  -  X(  AOI ) )  ; 

75  X(GI)=M\  (MG-  mu*  (1./  (X(  AI )  -  X(  AOI ) )  ) )  ; 

76  gradnorm  =  sqrt  (postint  (fern,  ’g~2’,  ’U’  ,  X)); 

77  if  iter  =  1 

78  gradnorm_ini  =  gradnorm; 

79  end 
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tolcg  =  min(0.5,  sqrt  (  gradnorm/ gradnorm  _ini  ) )  ; 

P  =  R  +  le  —  10*eye  ( length  (AI ))  ; 

[D,flag,relres,CGiter,resvec]  =  peg  (@(V)elliptic_apply_logb(V,  chi  ,W,  .  .  . 

AF,C,R,X,  AI ,  AOI  ,mu)  ,RHS,  tolcg  , 3  00  ,P) ; 
nrCGiter  =  nrCGiter  +  CGiter; 

Xtry  =  X; 
if  mu  <=  0 

alpha  =  1; 

else 

idx  =  find  (D  <  0); 

Aviol  =  X( AI)  -  X(AOI)  ; 
if  min(Aviol)  <=  le— 9 

error(’point  is  not  feasible  ’); 

end 

if  ( isempty  ( idx  ) ) 
alpha  =  1; 

else 

alphal  =  min(  —  Aviol  ( idx  ). /D(  idx  ))  ; 
alpha  =  min  ( 1  ,  0. 9995*  alphal  )  ; 

end 

end 

descent  =  0; 
no_backtrack  =  0; 

while  (“descent  &&  no_backtrack  <  20) 

Xtry  ( AI )  =  X(AI)  +  alpha  *  D; 

Xtry  (dAI)  =  D; 

fem.xmesh  =  meshextend  ( fern  )  ; 

fern,  sol  =  femlin(fem,  ’Solcomp’,  ’u’,  ’U’  ,  Xtry); 

Xtry(UI)  =  fern  .  sol  .  u  (UI )  ; 

[cost,  misfit,  reg,  logb]  =  elliptic_cost_logb  (fern  ,  Xtry  ,  AI ,  AOI  ,mu)  ; 
if  (cost  <  cost_old  +  c  *  alpha  *  MG’  *  D) 
cost  _old  =  cost  ; 
descent  =  1; 

else 

no.backtrack  =  no_backtrack  +  1; 
alpha  =  rho  *  alpha  ; 

end 

end 

if  (descent ) 

X  |  Xtry; 

else 

error  (  ’  Linesearch  failed  \n  ’ )  ; 

end 

fern,  sol  =  femsol  (X)  ; 

a_update  m  sqrt  (postint  (fern,  ’  delt  a_a  "  2  ’ ) ) ; 
dist  =  sqrt  (  postint  (fern  ,  ’(  atrue  —  a)~2’)); 

if  ((a_update  <  tol  &&  iter  >1)  ||  gradnorm  <  tol) 

fprintf  (’  ***  GN  converged  after  %d  iterations.  ***\n’,  iter); 
break  ; 

end 

end 

The  function  that  implements  the  cost  function  evaluation  for  given  solution  vector  X 
is: 

function  [cost,  misfit,  reg  ,  logb  ]  =  elliptic  .cost  _logb  (fern  ,  X, 

AI,  AOI,  mu) 
fern,  sol  =  femsol(X); 

misfit  =  postint  (fern,  ’  0 . 5  *  ( u— ud  )  ~  2  ’ ) ; 
reg  =  postint  (fern,  ’  0 . 5  *gamma*  ( ax“2+ay  ~  2  )  ’ ) ; 
logb  =  mu  *  sum ( log  (X(  AI)— X(  AOI ) ) )  ; 
cost  =  misfit  +  reg  —  logb  ; 
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Next,  we  list  the  function  that  applies  the  reduced  Hessian  to  a  vector  V: 

1  function  GNV  =  elliptic_apply_logb(V,  chi,  W,  AF,  C,  R,  X,  AI ,  AOI  ,  mu) 

2  du  =  AF  \  (AF’  \  (chi  .*  (-C  *  V))); 

3  dp  =  AF  \  (AF’  \  (chi  .*  (-W  *  du ) ) )  ; 

4  Z  —  mu  *  spdiags(  1  .  /  (X(  AI)— X(  AOI  ))/2  ,  0,  length(AI),  length(AI)); 

5  GNV  =  C’  *  dp  +  (R  +  Z)*  V; 

B.3.  Conjugate  gradient  method  for  inverse  advective-diffusive  trans¬ 
port.  In  this  section,  we  give  the  implementation  for  the  conjugate  gradient  method 
(corresponding  to  a  single  step  of  the  Gauss-Newton-CG  method)  for  the  initial  con¬ 
dition  inversion  in  advective-diffusive  transport  as  described  in  Section  3. 

1  clear  all;  close  all; 

2  global  cgcount ;  cgcount  =  0; 

3  T  =  4;  Tm_start  =  1;  Tm  =  Tm_start  /  T; 

4  ntime  =  20; 

5  gammal  =  0;  %le  —  5; 

6  gamma2  =  1  e  —  6 ; 

7  datanoise  =  0.005; 

8  s 1  —  square2 (’ 1  ’,’ base corner  pos 0  0  ’})  5 

9  s 2  —  rect 2  (  ’  0 . 2 5  ’  ,  ’  0 . 2 5  ’  ,  ’ base  ’  ,  ’ corner  ’  ,  5 pos ’  , {  ’  0 . 2 5  ’  ,  ’  0 . 1 5  ’  }  ) ; 

10  s3  =  rect  2  (  ’  0 . 1  5  ’  ,  ’  0 . 2  5  ’  ,  ’  base  ’  ,  ’  corner  ’  ,  ’  pos  ’  ,  {  ’  0 . 6  ’  ,  ’  0 . 6  ’  }  ) ; 

n  fern  .  geom  =  si—  s2—  s3  ; 

12  fem.bnd.ind  =  [1  23444444445]; 

13  fem.pnt.ind  =  [1  22222222222]; 

14  fern  .  mesh  =  meshinit  ( fern  ,  5  hauto  ’  ,  3); 

15  fern  .  dim  =  {  ’  p  5  ’q’  ’u’  ’u0’  ’vl  ’  ’v2’}; 

16  fern .  shape  =  [2  1  2  2  2  2]; 

17  fern. const  =  {’Re’,  le2  ,  ’kappa’,  .001}; 

is  fern  .  expr  .  true  _init  =  ’  min  ( 0 . 5  ,  exp  (  — 100*  ( (x  —  .  35)  *  (x  —  .35)  +  (y  —  .  7)  *  (y  —  .  7 ) ) ) )  ’ ; 

19  fern  .  equ  .  weak  =  [’  —  (2/Re  *  (vlx  *  vlx_test  +  0 . 5  *  (  vly+v2x )  *  (  v  ly  _t  est+v2x_t  est  )  ’ 

20  ’+  v2y  *  v2y_test)  +  (vl  *  vlx  +  v2  *  v2x)  *  vl.test  ’... 

21  ’+  (vl  *  vly  +  v2  *  v2y)  *  v2_test  ’  .  .  . 

22  ’-  q  *  (vlx.test  +  v2y _test  )  -  q.test  *  (vlx  +  v2y)  )  ’]; 

23  fern .  bnd .  r  =  {{’vl’  ’v2  — 1’}  { ’vl  ’  ’v2’}  {’vl’  ’v2’}  {’vl’  ’v2’}  {’v2  +  l’  ’vl’}}; 

24  fern  .  pnt  .  constr  =  {{’q’}  {}}; 

25  fern  .  xmesh  =  meshextend  ( fern  )  ; 

26  fern. sol  m  femnlin(fem,  ’  Solcomp  ’  ,  {’q’,  ’vl  ’  ,  ’v2’}); 

27  figur  e  (’ Default  AxesFontSize  ’,  2  0 )  ;  box  on;  hold  on; 

28  postplot  (fern  ,  ’  arrowdata  ’  ,  {  ’  vl  ’  ,  ’  v2  ’  }  ,  ’  arrowxspacing  ’  ,  15  ,  ... 

29  ’  arrowy  spacing  ’  ,  15 ,  ’  axis  ’,[0,1, 0,1]);  axis  equal  tight; 

30  nodes  =  xmeshinfo  ( fern  ,’out’,  ’nodes’); 

31  dofs  =  nodes  .  dofs  ’  ; 

32  PI  =  dofs  (  dofs  (:,  1  )>  0 , 1 ) ; 

33  UI  =  dofs  (dofs  (:  ,3)  >0  ,3); 

34  U0I  =  dofs  (dofs  (:  ,4)  >0  ,4); 

35  X  —  fern  .  sol  .  u  ; 

36  fern .  equ .  weak  =  [’  —  (kappa  *  (ux  *  ux_test  +  uy  *  uy.test  )  ’... 

37  ’+  (vl  *  ux  +  v2  *  uy)  *  u.test  )  ’]  ; 

38  fern  .  xmesh  =  meshextend  ( fern  )  ; 

39  K  P  assemble(  fern  ,  ’U’  ,  X,  ’out’,  ’K’ )  ; 

40  N  =  K(  UI ,  UI )  ; 

41  fern  .  equ  .  weak  =  ’  — (ux  *  ux_test  +  uy  *  uy_test)’; 

42  fern  .  xmesh  =  meshextend  ( fern  )  ; 

43  K  =  assemble(  fern  ,  ’U’  ,  X,  ’out’,  ’K’ )  ; 

44  R  =  K(  UI ,  UI )  ; 

45  fern  .  equ  .  weak  =  ’—  (u  *  u_test  )  ’ ; 

46  fern  .  xmesh  =  meshextend  ( fern  )  ; 

47  K  =  assemble(  fern  ,  ’  out  ’  ,  ’K’ )  ; 

48  M  =  K(  UI ,  UI )  ; 

49  fern  .  equ  .  weak  =  ’O’; 
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50  fern .  bnd .  weak  =  {{}  {}  {}  {’  —  u  *  u_test  ’}  {}}; 

51  fem  .  xmesh  =  meshextend  ( fem  )  ; 

52  MB  =  assemble(  fem  ,  ’  out  ’  ,  ’K  ’ )  ; 

53  Q  =  MB(UI ,  UI )  ; 

54  dt  —  T  /  ntime  ; 

55  L  =  M  +  dt  *  N; 

56  fem  .  equ  .  weak  =  ’  —  (uO  *  uO_test  —  true_init  *  uO_test  ) 

57  fem  .  xmesh  =  meshextend  ( fem  )  ; 

58  fem.  sol  =  femlin(fem,  ’Solcomp’,  ’uO’); 

59  X(UOI)  —  fem  .  sol  .u(UOI); 

60  UU  =  zeros  ( length  ( UI )  ,  ntime); 

61  PP  =  zeros  ( length  ( PI )  ,  ntime); 

62  UD  =  zeros(length(UI),  ntime); 
es  UU  ( :  ,1)  =  X(UOI); 

64  for  k  =  1:(  ntime—  1) 

es  UU  ( :  ,  k+1)  =  L  \  (M  *  UU  ( :  ,k)); 

66  end 

67  randn  (  ’  seed  ’  ,  0 ) ; 

68  for  i t  =  1 :  ntime 

69  UD(:,  it)  =UU(:,  it)  +  datanoise  *  max(  abs  (UU  ( :  ,  it)))  *... 

70  randn  ( length  (UU ( :  ,  i t  ) )  ,  1 ) ; 

71  end 

72  F  =  Q  *  (dt  *  UD(:  ,  ntime  ))  ; 

73  PP(:  ,  ntime)  =  L’  \  F; 

74  for  k  =  (ntime —1): —1:2 

75  m  =  ( k  >  Tm  *  ntime  )  ; 

76  F  =  M  *  PP ( :  ,  k+1)  +  m  *  Q  *  (dt  *  UD( :  ,  k  ) )  ; 

77  PP  ( :  ,  k )  =  L  ’  \  F  ; 

78  end 

79  PP  ( :  ,  1 )  =  M  *  PP  (  :  ,  2  )  ; 

so  Prec  =  gamma2  *  R  +  max(le  — 11,  gammal)  *  M; 

si  [UO,  flag  ,  relres  ,  CGiter  ,  resvec]  =  peg  (@(U0)  ad_apply  (UO ,  L,  M,  R,  .  .  . 

82  Q,  dt  ,  Tm,  ntime,  gammal,  gamma2 ,  UU,  PP)  ,  PP  ( :  ,  1 )  ,  le—  4,  1000,  Prec) 

The  function  that  applies  the  reduced  Hessian  to  a  vector  is  listed  below: 

1  function  out  =  ad_apply(U0,  L,  M,  R,  Q,  dt  ,  Tm,  ntime,  gammal,  gamma2 , 

2  UU,  PP) 

3  global  egeount  ; 

4  UU  ( :  ,1)  =  U0; 

5  for  k  =  l:(ntime—  1) 

e  UU  ( :  ,  k+1)  =  L  \  (M  *  UU  ( :  ,  k  ) ) ; 

7  end 

8  F  =  Q  *  (—  dt  *  UU ( :  ,  ntime  ))  ; 

9  PP(:  , ntime)  =  L’  \  F; 

10  for  k  =  (ntime —1): —1:2 

n  m  =  ( k  >  Tm  *  ntime  )  ; 

12  F  =  M  *  PP  ( :  ,  k+1)  +  m  *  Q  *  (—  dt  *  UU ( :  ,  k  ) )  ; 

is  PP  ( :  ,  k )  =  L  ’  \  F; 

14  end 

is  MP( :  ,1)  =  M  *  PP(: ,2); 

16  out  =  —  MP(:  ,1)  +  gammal  *  M  *  U0  +  gamma2  *  R  *  U0; 

17  egeount  =  egeount  +  1; 

B.4.  Gauss-Newton-CG  for  nonlinear  elliptic  inverse  problems.  In  what 
follows,  we  list  the  complete  code  for  the  Gauss-Newton-CG  method  applied  for  the 
nonlinear  elliptic  inverse  problem,  in  which  we  invert  for  volume  and  boundary  source 
terms  as  described  in  Section  4. 

1  clear  all;  close  all; 

2  gl=circ2  (  ’1  5  ,  ’base  ’  ,  ’center  ’  ,  ’pos’  ,{  ’0  ’  ,  ’0’}  ,  ’rot  ’  ,  ’0  ’); 

3  theta  =  [ 0  :  2  *  pi / 6 : 2* pi ] ; 
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4  npts  =  5; 

5  k  —  1 ;  s  =  ’gl  ’  ; 

6  for  i  =  1 :  npts 

7  if  i  =  npts 

8  theta  =  [ 0 : 2 *  pi /32 : 2 *  pi ]  ; 

9  end 

10  for  j  =  1 :  length  ( theta)  — 1 

n  x(j,i)  sa  (  i /(  npts +  1))*  cos  ( theta  ( j  ))  ; 

12  y(j,i)  =  (  i /(  npts  +  1))*  sin  ( theta  (j  ))  ; 

13  eval  ( [  ’  p  ’  ,  num2str(k),  ,=  point2(x(’,num2str(j 

14  num2str  (  i  )  ,  ’ )  ,  y  (  ’  ,  num2str  ( j  )  ,  5  ,  ’  ,  num2str  (i)  ,  ’));  ’]); 

is  s  =  [s,  ’,  p’,  num2str(k)]; 

16  k  =  k  +  1; 

17  end 
is  end 

19  eval  ( [  ’  fem  .  geom  =  geomcoerce  (  ’  ’  solid  s  ,  ’  }  )  ;  ’]); 

20  figure;  geomplot  (fem .  geom  ,  ’edgelabels’,  ’on’,  ’pointlabels’,  ’on’); 

21  sind  =  [];  spnt  =  [];  smesh  =  [];  totpnts  =  k  +  3; 

22  for  it  =  l:totpnts 

23  if  it  =  1  II  it  =  totpnts/2  — 1  ||  it  =  totpnts/2— 1+3  ||  it  =  totpnts 

24  ind  =  1; 

25  else 

26  ind  =  2; 

27  spnt  =  [spnt,  ’  num2str  (  it  )  ]  ; 

28  smesh  =  [smesh,  num2str(it),  ’,  0.1,  ’]; 

29  end 

30  sind  =  [sind,  5  ’,  num2str  ( ind  )  ]  ; 

31  end 

32  eval  ( [  ’  fem  .  mesh=meshinit  ( fem  ,  ’  ’  hauto  5  5  ,  7  ,  ’  ’  hmaxvtx  5  5  ,  [  ’  ,  smesh  ,  ’  ]  )  ;  ’  ]  ) 

33  eval  ( [  ’  fem  .pnt.ind  ==  [  ’  ,  sind  ,  ’  ]  ;  ’  ]  )  ; 

34  eval  ( [  ’  pts  =  [  ’  ,  spnt  ,  ’  ]  ;  ’  ]  )  ; 

35  eval  ( [  ’  fem  .  geom  =  geomcoerce  (  ’  ’  solid  s  ,  ’  }  )  ;  ’]); 

36  figure  ,  meshplot  ( fem  )  ; 

37  fem  .  dim  =  {  ’  delt a_f  5  ’  delta.g  ’  ’  delta.p  ’  ’  delta  _u  ’  ’  f  ’  ’g  ’  ’grad’ 

38  ’  p  ’  ’  u  ’  ’  ud  ’  }  ; 

39  fem .  shape  =  [1  12211122  2]; 

40  fem  .  equ  .  expr  .  ft  rue  =  ’1  +  3*(y>  =  0)’; 

41  fem  .  bnd  .  expr  .  gtrue  gft  ’x  +  y’; 

42  datanoise  =  0.01; 

43  fem  .  equ  .  expr  . gammal  =  ’le— 5’; 

44  fem  .  equ  .  expr  .  gamma2  =  ’  le  —  4’; 

45  maxiter  =  100; 

46  tol  =  le— 7; 

47  ntol  ■  le— 9; 

48  fem.  const  =  {  ’ c  ’  ,  1000}; 

49  cl  =  le— 4; 

50  rho  =  0.5; 

51  nrCGiter  =  0; 

52  fem  .  equ  .expr.  goal  =  [’  —  (udx*  udx_t  est+udy*  udy  _t  est+ud*  ud_test  +  ’... 

53  c*ud~3*  ud.test  —  ftrue  *  ud.test  )  ’]; 

54  fem  .  equ  .  expr  .  st at e  =  [  ’  —  (ux*  ux_test+uy*  uy  _test+u*  u.test  +  ’ . . . 

55  ’  c*u  "3*  u_test  —  f  *  u  _t  est  )  ’]; 

56  fem  .  equ  .  expr  .  adj oint  =  [  ’  —  (px*  px_test+py*  py  _test+p*  p_test  +  ’ . . . 

57  ’3*c*u"2*p*  p_test  )  ’]; 

58  fem  .  equ  .  expr  .  incst ate  =  [  ’  —  (  delta_ux*  delt a_px_test  +  ’ . . . 

59  ’delta_uy*  de  It  a  _py  _t  es  t+delt  a_u  *delta_p_test  +  ’... 

60  ’3*  c*u  "2*  delt  a_u  *  delt  a_p  _t  est  — de  1 1  a  _f  *  de  It  a  _p  _t  est  )  ’]; 

61  fem  .  equ  .  expr  .  incadj  oint  =  [  ’  —  (  delta_px*  delt  a_ux_test  +  ’ . . . 

62  ’delta.py*  delta.uy  _tes  t+delt  a_p  *  delta_u_test  +  ’... 

63  ’3*c*u"2*  delta_p*  delta_u_test  )  ’]; 

64  fem  .  equ  .  expr  .  inccontrolf  =  [  ’  —  (gammal*  (  d e  1 1  a _fx  *  d e  1 1  a _f x  _t  e s t  +  ’ . . . 

65  ’  delta_fy  *  delt  a_fy_test )—  delt  a_f_test  *delta_p  ’... 
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66  ’+gammal*(  fx*delta_fx_test+fy*delta_fy_t 

67  fern  .  bnd  .  expr  .  inccontrolg  =  [  ’  ’  ]  ; 

68  fem  .  equ  .  weak  =  ’goal 

69  fem  .  bnd  .  weak  =  {{’  — gtrue  *  ud.test  ’}}; 

70  fem  .  xmesh  =  meshextend  ( fem  )  ; 

71  fem.  sol  =  femnlin(fem,  ’  Solcomp  ’  ,  ’ud’,  ’ntol’, 

72  nodes  =  xmeshinfo  ( fem  , ’out’,  ’nodes’); 

73  dofs  =  nodes  .  dofs  ’  ; 

74  dFI  =  dofs  (  dofs  (:,  1  )>  0 , 1 ) ; 

75  dGI  =  dofs  (  dofs  ( :  ,  2 )  >  0  , 2 ) ; 

76  dPI  =  dofs  (  dofs  ( :  ,  3 )  >  0  , 3 ) ; 

77  dUI  =  dofs  (  dofs  ( :  ,  4 )  >  0  , 4 ) ; 

78  FI  =  dofs  (  dofs  ( :  ,  5 )  >  0  , 5 ) ; 

79  GI  =  dofs(dofs(:,6)>0,6); 
so  GRI  =  dofs(dofs(:,7)>0,7); 
si  PI  =  dofs(dofs(:  ,8)  >0,8); 

82  UI  =  dofs(dofs(:  ,9)  >0,9); 

83  UDI  m  dofs  (  dofs  (:,  1 0)  >  0 , 1 0) ; 

84  X  —  fem  .  sol  .  u  ; 

85  fem .  equ .  weak  =  ’—  grad  *  grad_test  ’; 

86  fem  .  xmesh  =  meshextend  ( fem  )  ; 

87  K  =  assemble(  fem  ,  ’  out  ’  ,  ’K’ )  ; 

88  M  =  K(GRI ,  GRI ) ; 

89  fem  .  equ  .  weak  =  ’O’; 

90  fem  .  bnd  .  weak  =  ’O’; 

91  fem. pnt. weak  =  {{},{’  —  p  *  p_test  ’}}; 

92  fem  .  xmesh  =  meshextend  ( fem  )  ; 

93  K  =  assemble(  fem  ,  ’  out  ’  ,  ’K’ )  ; 

94  B  =  K(  PI  ,  PI )  ; 

95  fem  .  equ  .  weak  =  ’O’; 

96  fem  .  bnd  .  weak  =  {{’  —  g  *  g_test  ’}}; 

97  fem  .  xmesh  =  meshextend  ( fem  )  ; 

98  K  =  assemble(  fem  ,  ’  out  ’  ,  ’K’ )  ; 

99  QP  =  K(GI,GI); 

100  randn  (  ’seed  ’  ,  0); 

101  X(UDI)  =  X(UDI)  +  datanoise  *  max(  abs  (X(UDI ) ) )  * 

102  X(FI)  =  3.0; 
los  X(GI)  =  0.0; 

104  fem.  sol  =  femsol(X); 

105  fem .  equ .  weak  =  ’state  +  incstate  +  incadjoint  + 

106  fem  .  bnd  .  weak  =  {{’—  gamma2*  delta_g*  delta_g_test  ’. 

107  ’  —  delta_p*delta_g_test  —  delta_g*delta_p_test  - 
los  for  iter  =  l:maxiter 

fem  .  xmesh  =  meshextend  ( fem  )  ; 

fem.  sol  =  femnlin(fem,  ’Solcomp’,  {  ’  u  ’  }  ,  ’U’ 

X(UI)  =  fem.  sol  .u(UI); 
if  (iter  =  1) 

[cost_old  ,  misfit  ,  reg]  =  nl_cost(fem,  X, 

end 

fem  .  xmesh  =  meshextend  ( fem  )  ; 

K  =  assemble(  fem  ,  ’out’,  ’K’  ,  ’U’  ,  X); 

W=  K(dUI,dUI); 

A  =  K(  dUI ,  dPI )  ; 

Cl  =  K(  dPI  ,  dFI )  ; 

R1  =  K(  dFI,  dFI); 

R2  =  K(dGI ,  dGI )  ; 

C2  =  K(  dPI  ,  dGI )  ; 

AF  =  chol  (A)  ; 

X(PI)  =  AF  \  (AF’  \  (B  *  (X(UDI)  -X(UI)))); 
MGF  =  Cl  ’  *  X(  PI )  +  R1  *  X(  FI )  ; 

MGG  =  C2  ’  *  X(PI)  +  R2  *  X(GI); 

MG  =  [MGF;  MGG]  ; 
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e  s  t )—  d  e  1 1  a  _f  _t  e  s  t  *p  )  ’]; 

nt ol ) ; 


randn  ( length  (UDI )  ,1); 

inccont rolf  ’  ; 

-g*  u.test  ’  }  }  ; 

,  X,  ’  ntol  ’  ,  nt  ol  )  ; 

pts  )  ; 
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X(GRI)  =  M  \  MGF; 

gradnormf  =  sqrt  (postint  (fem,  ’grad“2’,  ’U’  ,  X)); 

X(GRI)  =  QP  \  MGG; 

gradnormg  =  sqrt  (postint  (fem,  ’grad  "2’,  ’U’  ,  X)); 

if  iter  =  1 

gradnormf_ini  =  gradnormf; 
gradnormg_ini  =  gradnormg; 

end 

tolcg  =  sqrt  ((  gradnormf/ gradnormf  _ini  )"  2  ..  . 

+  (  gradnormg/ gradnor  mg_ini  )~  2 ) ; 
tolcg  =  min(0.5,  sqrt  ( tolcg  ))  ; 

P  =  [R1  zeros  ( length  (GI));  zeros (length (FI))  R2] 

+  le— 10*eye  ( length  (GI)  +  length(FI)); 

[D,flag,relres,CGiter,resvec]  =  peg  (@(V)nl_apply  (V,B,  AF,  Cl  ,  .  .  . 

C2,  R1,R2), -MG,  tolcg  ,400  ,P)  ; 
nrCGiter  =  nrCGiter  +  CGiter  ; 

Xtry  =  X; 
alpha  =  1.0; 
descent  =  0; 
no.backtrack  =  0; 

while  (“descent  &&  no_backtrack  <  20) 

Xtry  (  FI )  =  X(FI)  +  alpha  *  D(  1 :  length  (FI ))  ; 

Xtry(dFI)  =  D(l:length(FI)); 

Xtry  ( GI)  =  X(GI)  +  alpha  *  D(  length  (FI)  +  l:end  )  ; 

Xtry(dGI)  =  D(length(FI)  +  l:end); 
fem.xmesh  =  meshextend  ( fem  )  ; 

fem.  sol  =  femnlin(fem,  ’Solcomp’,  ’u’,  ’U’  ,  Xtry,  ’ntol’,  ntol); 

Xtry(UI)  =  fem  .  sol  .  u  (UI )  ; 

[cost,  misfit,  reg]  =  nl_cost  (fem  ,  Xtry  ,  pts  )  ; 
if  (cost  <  cost_old  +  cl  *  alpha  *  MG’  *  D) 
cost  _old  =  cost ; 
descent  =  1; 

else 

no_backtrack  =  no_backtrack  +  1; 
alpha  =  rho  *  alpha ; 

end 

end 

if  (descent) 

X  =  Xtry; 

else 

error  (’ Linesearch  failed  \n  ’ )  ; 

end 

fem.  sol  =  femsol  (X)  ; 

f_update  =  sqrt  (postint  (fem,  ’  delt  a_f  ~  2  ’ ) ) ; 
g_update  =  sqrt  (postint  (fem,  ’  delt  a_g  ~  2  ’ ) ) ; 
dist  =  sqrt  (  postint  (fem  ,  ’(ftrue  —  f  )  “  2  ’ ) ) ; 
graddir  =  sqrt  (—MG’  *  D)  ; 

if  (f_update  <  tol  &&  g_update  <  tol)  &&  (iter  >1)  ||... 

(gradnormf  <  tol  &&  gradnormg  <  tol) 
fprintf  (’  ***  GN  converged  after  %d  iterations.  ***\n’,  iter); 
break ; 

end 


The  function  that  evaluates  the  cost  function  for  a  given  solution  vector  is: 

1  function  [cost  ,  misfit  ,  reg]  =  nl_cost(fem,  X,  pts) 

2  misfit  =  0 ; 

3  for  it  =  1 :  length  (  pts  ) 

4  misfit  s=  misfit  +  postint  (fem,  ’0 . 5*  (  u— ud )  ~  2  ’  ,  ’  dl  ’  ,  pts(it),  ... 

5  ’edim  ’  ,0  ,  ’U’  ,  X)  ; 

6  end 

7  regl  =  postint  (fem,  ’  0 . 5  *gammal  *  (  fx  *  fx+fy  *  fy  )  ’); 
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8  reg2  :  postint  (fern  ,  ’  0 . 5  *  gamma2*g*g  ’,’dl’,[l,2,3,4],  ’edim  ’  ,  1 )  ; 

9  reg  =  regl  +  reg2 ; 

10  cost  =  misfit  +  reg  ; 

Finally,  the  function  that  applies  the  reduced  Hessian  to  a  vector  V  is  as  follows: 

1  function  GNV  =  nl_apply(V,  B,  AF,  Cl,  C2 ,  R1 ,  R2) 

2  [  n  ,m]  =  size  (R1 )  ; 

3  VI  =  V(  1 :  n  )  ; 

4  V2  =  V(n  +  l:end); 

5  du  =  AF  \  (AF’  \  (-C1  *  VI  -  C2  *  V2)); 

e  dp  =  AF  \  (AF’  \  (— B  *  du)); 

r  GNV1  =  Cl’  *  dp  +  R1  *  VI; 

s  GNV2  =  C2’  *  dp  +  R2  *  V2; 

9  GNV  =  [GNV1 ;  GNV2]  ; 
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